Proceedings of the

2020 IEEE/ACM 24th International


Symposium on Distributed Simulation
and Real Time Applications (DS-RT)

September 14-16, 2020


Prague, Czech Republic

Copyright and Reprint Permission: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy
beyond the limit of U.S. copyright law for private use of patrons those articles in this volume that carry a code at the bottom
of the first page, provided the per-copy fee indicated in the code is paid through Copyright Clearance Center, 222 Rosewood
Drive, Danvers, MA 01923. For reprint or republication permission, email to IEEE Copyrights Manager at pubs-
permissions@ieee.org. All rights reserved. Copyright ©2020 by IEEE.
Proceedings of the 2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real
Time Applications (DS-RT)

ISBN 978-1-7281-7343-6 (XPLORE COMPLIANT), Part Number CFP20186-ART

ISBN 978-1-7281-7342-9 (USB), Part Number CFP20186-USB

Number of pages: 239

1st edition

Processed by: VSB-Technical University of Ostrava

Contact:

Miroslav Voznak, VSB-Technical University of Ostrava, Faculty of Electrical Engineering and Computer
Science, 17. listopadu 2172/15, 708 00 Ostrava, Czech Republic

miroslav.voznak{at}vsb.cz

Editors:

Dusan Maga
Jiri Hajek

Each paper has been reviewed. The responsibility for the content and language of each paper rests
solely on its author(s).

First edition, 2020

DS-RT 2020 is organized by:

VSB-Technical University of Ostrava

Association for Computing Machinery Special Interest Group on Simulation and Modeling - ACM
SIGSIM

IEEE Computer Society

Technical support, inquiries and additional copies

Miroslav Voznak
VSB-Technical University of Ostrava
17. listopadu 2172/15
708 00 Ostrava
Czech Republic
tel.: +420 596 995 940
e-mail: miroslav.voznak{at}vsb.cz
http://ds-rt.com/2020/home#

A message from chairs
A warm welcome to the 2020 IEEE/ACM 24th International Symposium on Distributed Simulation and
Real Time Applications (DS-RT), originally planned to be organized in Prague. The coronavirus
pandemic situation forced us to switch this event to online conferencing. The decision was made after
many discussions. Rules for entering the territory of the Czech Republic have changed frequently
since March 2020, and it is impossible to guarantee adequate conditions in Prague. We are well aware
that the conference is above all a place for meetings and discussions; nevertheless, we have done
our best to prepare an attractive conference program for you in cyberspace.

DS-RT serves as a forum for simulationists from academia, industry and research labs, to present recent
research results that target the growing overlap between large distributed simulations and real-time
applications. About 91 papers were submitted, of which 10 were withdrawn (some of them after
review) and 24 were accepted as regular papers. In addition to the regular papers, 8 papers were
accepted as short papers. These 32 papers are divided into eight sessions running sequentially over
three days.

We selected the most popular videoconferencing tool for supporting the event. Nevertheless, each of
the conference days starts with a testing session in which participants are given the possibility to solve
technical issues, if there are any. We also prepared the possibility of watching a live stream from DS-RT
2020 on a private YouTube channel. We expect this option to be the best for participants who do not
want to be connected directly to the conference room. The scheduling of presentations was a more
difficult task this year due to the different time zones of the speakers, which were considered as a new
parameter in the multi-criteria planning. The conference sessions are the following: Distributed
Simulations; Scheduling & Simulations; Real-Time Simulations; Cloud, Fog & Edge Computing; Vehicular &
Edge Computing; Secure & Efficient Computing; Simulations & Modelling; and, last, UAVs & Simulations.
In addition to the sessions above, the program also includes three keynote speeches opening the individual
conference days. The first keynote is on "Evolutionary Algorithms and its Use in Modelling and
Simulations of the Complex Systems", given by Prof. Ivan Zelinka from VSB-Technical University of
Ostrava, Czechia. The second conference day starts with a keynote on "Modeling & Simulation Based
Framework for Interoperability Driven Enterprise Design" delivered by Prof. Gregory Zacharewicz from
IMT – Mines Ales, France. The last keynote, "Stability and Hidden Attractors in the Simulation and
Theoretical Study of Dynamical Models", is given by Prof. Nikolay V. Kuznetsov from Saint-Petersburg
State University, Russia. The best paper award announcement is scheduled for the closing session, when
we will also disclose the next venue of IEEE/ACM DS-RT 2021.
In recognition of the papers' quality and originality, the following awards of IEEE/ACM DS-RT 2019
were given last year:

- in category of regular papers

The Best Paper Award to Marco Rapelli, Claudio E. Casetti and Giandomenico Gagliardi for the
paper "TuST: from Raw Data to Vehicular Traffic Simulation in Turin."

The Best Paper Runner-up Award to Shingo Igarashi, Takuya Azumi, Yuto Kitagawa, Tasuku
Ishigooka and Tatsuya Horiguchi for the paper "Multi-rate DAG Scheduling Considering
Communication Contention for NoC-based Embedded Many-core Processor."

- in category of short papers

The Best Short Paper Award to Robert Chodorek, Agnieszka Chodorek and Krzysztof Wajda for
the paper "Media and non-media WebRTC communication between a terrestrial station and a
drone: the case of a flying IoT system to monitor parking."

The Best Short Paper Runner-up Award to Armir Bujari, Jordan Gottardo, Claudio E. Palazzi
and Daniele Ronzani for the paper "Message Dissemination in Urban IoV."

We would like to express our sincere gratitude to the members of the organizing, steering and program
committees, and also to the reviewers, speakers and especially all authors for their contributions, effort,
and time. Special thanks also go to our sponsors, IEEE Computer Society and ACM SIGSIM, and to the
editorial staff at IEEE Conference Publication Services for their work in producing these proceedings.
Thanks to the VSB-Technical University of Ostrava, CZ and its staff for the management of all duties
related to organizing the conference. Last but not least, we appreciate the support provided by
CESNET (the national research and education network association in Czechia) in switching IEEE/ACM
DS-RT 2020 to the virtual conference mode.

IEEE/ACM DS-RT 2020 General Chair

Miroslav Voznak

Technical University of Ostrava, Czech Republic

IEEE/ACM DS-RT 2020 Program Co-Chairs

Floriano De Rango

University of Calabria, Italy

Carlos Tavares Calafate

Universitat Politècnica de València, Spain

In Prague, September 4, 2020.

Organizing Committees

General Chair
Miroslav Voznak, VSB – Technical University of Ostrava, Czech Republic

Program Co-Chairs
Carlos Tavares Calafate, Universitat Politècnica de València, Spain
Floriano De Rango, University of Calabria, Italy

Special Sessions Chair
Rodolfo W. L. Coutinho, Concordia University, Canada

Posters/Demo Chair
Hakki Gokhan Ilk, Ankara University, Turkey

Publicity Co-Chairs
Eirini Eleni Tsiropoulou, University of New Mexico, Mexico
Mirela Sechi Notare, University of Technology in Fly Transportation, Brazil
Mauro Tropea, University of Calabria, Italy

Finance Chair
Lukas Sevcik, Technical University of Ostrava

Publication/Proceedings Co-Chairs
Dusan Maga, Czech Technical University in Prague, Czech Republic
Mauro Tropea, University of Calabria, Italy

Local Organization Chair
Robert Bestak, Czech Technical University in Prague, Czech Republic

Web Chair
Noura Aljeri, University of Ottawa, Canada

Program Committee
Adeline Urmacher University of Rostock, Germany
Alfredo Garro University of Calabria, Italy
Angelo Furfaro University of Calabria, Italy
Armir Bujari University of Padua, Italy
Andrea D'Ambrogio University of Rome Tor Vergata, Italy
Chun-Wei Lin Western Norway University of Applied Sciences, Norway
Claudia Campolo Mediterranea University of Reggio Calabria, Italy
Danilo Amendola University of Trieste, Italy
Emanuel Puschita Technical University of Cluj-Napoca, Romania
Enrique Hernández-Orallo Universitat Politècnica de València, Spain
Franco Cicirelli University of Calabria, DIMES, Italy
Francesco Quaglia University of Rome "La Sapienza", Italy
Gabriele D'Angelo University of Bologna, Italy
Gabriel Wainer Carleton University, Canada
Georgios Keramidas Think Silicon, Greece
Giandomenico Spezzano Consiglio Nazionale delle Ricerche (CNR), Italy
Greg Zacharewicz IMT - Mines Ales, France
Hakki Ilk Ankara University, Turkey
Helen Karatza Aristotle University of Thessaloniki, Greece
Hoang-Sy Nguyen Binh Duong University, Vietnam
Iqbal Khan Qualcomm Technologies Inc., USA
Ivan Zelinka Technical University of Ostrava, Czech Republic
Jan Martinovic IT4Innovations, Czech Republic
Jerry Chun-Wei Lin Western Norway University of Applied Sciences, Norway
Juan-Carlos Cano Universidad Politecnica de Valencia, Spain
Libero Nigro University of Calabria, Italy
Marcin Niemiec AGH University of Science and Technology, Poland
Marco Morana University of Palermo, Italy
Mauro Tropea Università della Calabria, Italy
Michal Stepanovsky Czech Technical University in Prague, Czech Republic
Miralem Mehic University of Sarajevo, Bosnia and Herzegovina
Mirko Stoffers RWTH Aachen University, Germany
Pavel Tvrdik Czech Technical University in Prague, Czech Republic
Pierre Siron University of Toulouse, France
Philip Wilsey University of Cincinnati, USA
Pietro Manzoni Universitat Politècnica de València, Spain
Radek Fujdiak Brno University of Technology, Czech Republic
Seilendria Hadiwardoyo IMEC
Simon Taylor Brunel University, UK
Stefan Rass Universität Klagenfurt, Austria
Tan Nhat Nguyen Ton Duc Thang University, Vietnam
Vaclav Snasel Technical University of Ostrava, Czech Republic
Wentong Cai Nanyang Technological University, Singapore

Steering Committee
Azzedine Boukerche University of Ottawa, Canada
Sajal K. Das Missouri University of Science and Technology, USA
Paul Reynolds University of Virginia, USA
Stephen J. Turner KMUTT, Thailand
Albert Zomaya University of Western Australia, Australia
Rodolfo W. L. Coutinho Concordia University, Canada

Table of Contents
Claudia Campolo, Giacomo Genovese, Antonella Molinaro, Bruno Pizzimenti:
Digital Twins at the Edge to Track Mobility for MaaS Applications

Martin Drašar, Stephen Moskal, Shanchieh Yang, Pavol Zaťko:
Session-Level Adversary Intent-Driven Cyberattack Simulator

Michael Kyesswa, Philipp Schmurr, Hueseyin Kemal Cakmak, Uwe Kuehnapfel, Veit Hagenmeyer:
A New Julia-Based Parallel Time-Domain Simulation Algorithm for Analysis of Power System Dynamics

Anselm Erdmann, Anna Marcellan, Dominik Hering, Michael Suriyah, Carolin Ulbrich, Martin Henke,
André Xhonneux, Dirk Müller, Rutger Schlatmann, Veit Hagenmeyer:
On Verification of Designed Energy Systems Using Distributed Co-Simulations

Mauro Tropea, Abdon Serianni:
Bio-Inspired Drones Recruiting Strategy for Precision Agriculture Domain

Diego M. Jiménez-Bravo, Pierre Masala Mutombo, Bart Braem, Johann M. Marquez-Barja:
Applying Faster R-CNN in Extremely Low-Resolution Thermal Images for People Detection

Alexander Puzicha, Peter Buchholz:
Real-Time Simulation of Robot Swarms with Restricted Communication Skills

Shingo Igarashi, Tasuku Ishigooka, Tatsuya Horiguchi, Ryotaro Koike, Takuya Azumi:
Heuristic Contention-Free Scheduling Algorithm for Multi-core Processor Using LET Model

Maryan Rab, Romolo Marotta, Mauro Ianni, Alessandro Pellegrini, Francesco Quaglia:
NUMA-Aware Non-Blocking Calendar Queue

Andrea Piccione, Alessandro Pellegrini:
Agent-Based Modeling and Simulation for Emergency Scenarios: A Holistic Approach

Nicolas Nevigato, Mauro Tropea, Floriano De Rango:
Collision Avoidance Proposal in a MEC Based VANET Environment

Sung woon Park, Azzedine Boukerche, Shichao Guan:
A Novel Deep Reinforcement Learning Based Service Migration Model for Mobile Edge Computing

Diogo Torres, João Pedro Dias, André Restivo, Hugo Ferreira:
Real-time Feedback in Node-RED for IoT Development: An Empirical Study

Bassirou Ngom, Moussa Diallo, Nicolas Marilleau:
MEDART-MAS: MEta-Model of Data Assimilation on Real-Time Multi-Agent Simulation

Franco Cicirelli, Libero Nigro:
Model Checking Actor-based Cyber-Physical Systems

Moritz Gütlein, Wojciech Baron, Christopher Renner, Anatoli Djanatliev:
Performance Evaluation of HLA RTI Implementations

Tomas Potuzak:
Reduction of Inter-process Communication in Distributed Simulation of Road Traffic

Sergey Suslov, Michael Schiek, Markus Robens, Christian Grewing, Stefan van Waasen:
Simulating Heterogeneous Models on Multi-Core Platforms Using Julia's Computing Language Parallel Potential

Alberto Falcone, Alfredo Garro:
Pitfalls and Remedies in Modeling and Simulation of Cyber Physical Systems

Lorenzo Donatiello, Lorenzo Gasparini, Gustavo Marfia:
Laying the Path to Consumer-Level Immersive Simulation Environments

Emilie Bout, Valeria Loscrí, Antoine Gallais:
Energy and Distance Evaluation for Jamming Attacks in Wireless Networks

Awais Aziz Shah, Marco Mussini, Francesco Nicassio, Giorgio Parladori, Francesco Triggiani, Giovanni
Grieco, Giuseppe Iaffaldano, Giuseppe Piro:
A Real-Time Simulation Framework for Complex and Large-Scale Optical Transport Networks Based on the SDN Paradigm

Franco Cicirelli, Antonio Gentile, Emilio Greco, Antonio Guerrieri, Giandomenico Spezzano, Andrea Vinci:
An Energy Management System at the Edge Based on Reinforcement Learning

Jalil Boudjadar, Mohammad Hassan Khooban:
A Cost-effective Scheduling Control for a Safety Critical Hybrid Power System

Avinash Maurya, Bogdan Nicolae, Ishan Guliani, M. Mustafa Rafique:
CoSim: A Simulator for Co-Scheduling of Batch and On-Demand Jobs in HPC Datacenters

Jamie Wubben, Pablo Aznar, Francisco Fabra, Carlos T. Calafate, Juan-Carlos Cano, Pietro Manzoni:
Toward Secure, Efficient, and Seamless Reconfiguration of UAV Swarm Formations

Youssra Cheriguene, Soumia Djellikh, Fatima Zohra Bousbaa, Nasreddine Lagraa, Abderrahmane
Lakas, Chaker Abdelaziz Kerrache, Abdou El Karim Tahari:
SEMRP: an Energy-Efficient Multicast Routing Protocol for UAV Swarms

Giovanni Iacovelli, Pietro Boccadoro, Luigi Alfredo Grieco:
An Iterative Stochastic Approach to Constrained Drones' Communications

Nasos Grigoropoulos, Spyros Lalis:
Simulation and Digital Twin Support for Managed Drone Applications

Alessandro Ciociola, Michele Cocca, Danilo Giordano, Luca Vassio, Marco Mellia:
E-Scooter Sharing: Leveraging Open Data for System Design

Abubakar Saad, Robson E. De Grande:
MDP-based Vehicular Network Connectivity Model for VCC Management

Peppino Fazio, Miralem Mehic, Pavol Partila, Jaromir Tovarek, Miroslav Voznak:
A New Mobility Samples Encoding Scheme Based on Pairing Functions and Data Analytics

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Digital Twins at the Edge to Track Mobility for MaaS Applications
Claudia Campolo, Giacomo Genovese, Antonella Molinaro, Bruno Pizzimenti
University “Mediterranea” of Reggio Calabria, Italy
{claudia.campolo, giacomo.genovese, antonella.molinaro}@unirc.it; pzzbrn97b12h224a@studenti.unirc.it
978-1-7281-7343-6/20/$31.00 © 2020 IEEE

Abstract—The research into wireless communication and mobile computing is called upon to formulate
novel smart mobility solutions to improve the quality of a citizen’s life in smart cities. In such
a context, in this paper we elaborate on the role of technologies like multi-access edge computing
(MEC), Internet of Things (IoT) messaging protocols, such as Constrained Application Protocol
(CoAP) and Message Queue Telemetry Transport (MQTT), and virtualization (i.e., digital twins) in
the design of a framework enabling the collection and processing of data about the mobility of
commuters and public transport vehicles. Such data are meant to feed mobility monitoring and
transport planning solutions. A Proof-of-Concept (PoC) is developed to validate the framework under
realistic experimental settings. Results in terms of efficiency and effectiveness of the considered
messaging protocols are reported.
Index Terms—Multi-access Edge Computing, CoAP, Digital Twin, OMA LwM2M, MQTT, MaaS

I. INTRODUCTION
The proliferation of devices equipped with positioning capabilities has recently opened the way to
the development of a plethora of location-based applications. Tracking the user mobility can be
helpful to identify the best transport solution to satisfy her needs, to infer her preferences and
predict future travel demands. Retrieving mobility data from fleets of Public Transport (PT)
vehicles can support the route planning and enable the design of improved solutions for customers.
Both user and vehicle mobility data can feed Mobility as a Service (MaaS) applications tracking the
mobility of commuters who need to dynamically compose their trips through solutions of different
travel operators, spanning different means of transportation, e.g., bike, car, bus, planes, trains
[1], [2].
Besides positioning systems, Information and Communication Technologies (ICT) are needed to enable
the collection, delivery, processing, and presentation of the mobility-related information to all
the interested parties.
In our previous work [3] we identified prominent solutions addressing most of the aforementioned
functionalities by encouraging the joint usage of Constrained Application Protocol (CoAP) [4], the
lightweight messaging protocol specifically devised for Internet of Things (IoT) environments, and
the Open Mobile Alliance (OMA) Lightweight Machine-to-Machine (LwM2M) [5] resource description
model. This results in interoperable and efficient data retrieval and description. Moreover, there
we argued in favour of Multi-access Edge Computing (MEC) [6] facilities to host MaaS services by
offering lower latency and higher context-awareness compared to the remote cloud.
In this paper we make a step forward and extend the work in [3] by providing the following main
innovative contributions:
• We propose the usage of Digital Twins (DTs), acting as digital counterparts of physical entities,
  e.g., smartphones of commuters, On Board Systems (OBSs) of PT vehicles. They are in charge of
  retrieving mobility data of the corresponding physical entities and can be queried by
  stakeholders interested in such data.
• We design DTs as virtualized applications to be hosted at the network edge. This is a clear
  departure from the current literature [7] according to which DTs are hosted in the remote cloud.
  Moreover, we align the deployment to the European Telecommunications Standards Institute (ETSI)
  MEC reference architecture [8].
• We consider two different messaging protocols. In addition to CoAP [4], coupled with OMA LwM2M
  investigated in [3], we also leverage Message Queue Telemetry Transport (MQTT) [9].
• We assess the feasibility of the proposed framework through a realistic experimental validation
  conducted during a trip taken by a bus in an urban environment. During the trip, position
  information is retrieved through a Global Positioning System (GPS) receiver attached to a
  Raspberry Pi device [10].
• We quantitatively compare the two considered messaging protocols with respect to efficiency, in
  terms of bandwidth usage, and effectiveness, in terms of packet reliability.
The rest of the paper is organized as follows. Section II provides background material about the
enabling technologies and concepts of the proposed framework, which is then described in Section
III. The experimental setup is described in Section IV, as well as the main findings of the
evaluation study. Final remarks are reported in Section V.

II. MAIN TECHNOLOGY ENABLERS
A. Multi-access edge computing: a primer
MEC has been proposed by ETSI as a prominent paradigm offering computing and storage resources at
the edge of the mobile network, close to the subscribers. It provides high-bandwidth and ultra-low
latency access to radio network as well as context information, which can be exploited by many
verticals, such as transportation, industrial automation, entertainment, media, healthcare.
The reference ETSI architecture has been specified in [8] with a set of Application Programming
Interfaces (APIs) for key MEC interfaces, along with the main functionalities.
The ME host is the entity that contains the ME platform and a virtualization infrastructure which
offers computing, storage, and network resources for the ME applications. Such applications can
interact with the ME platform to consume and offer ME services. Services offered by the platform
are, for instance, the Radio Network Information Service (RNIS) and the Location Service (LS),
respectively providing radio-related and augmented positioning information.
The ME orchestrator is the core of the ETSI MEC architecture. It decides which ME host(s) is(are)
the most appropriate one(s) for application instantiation (and relocation) according to application
demands (e.g., latency), monitored available resources, and also mobility conditions.
The deployment of ME hosts in the edge domain is operator-specific: an ME host can be associated
either to each Base Station (BS) or to a set of them, covering either a small area or offering
urban coverage.
B. Digital Twins
Originally developed to improve manufacturing processes, digital twins now have a wider scope,
representing the digital replicas of living as well as non-living entities [11]. Such replicas are
designed as the semantic description of the sensed physical world [12], also incorporating
contextual and sensor data from them. This makes the service discovery easier, since metadata is
used to index the virtual devices, and the introduced semantic description is able to cope with
heterogeneity to provide interoperability among physical devices at a virtual level. As a result,
the digital twin enables data to be seamlessly transmitted between the physical and virtual worlds,
hence ensuring real-time monitoring of systems, helpful for their maintenance and for future
upgrades.
The DT concept can be also leveraged in the automotive domain. Starting from collected mobility
data, vehicle, bus and truck manufacturers can devise solutions to improve customer satisfaction
and vehicle fleet management [7]. Today, the common practice is storing and updating the DTs of
vehicles and other physical entities in the remote cloud [7].

III. THE PROPOSED FRAMEWORK
In this paper, we build on the previously described concepts and technologies and apply them to
efficiently and effectively track the mobility of commuters and of PT vehicles in order to collect
valuable data for PT service monitoring and planning.
To this aim, we propose the framework graphically sketched in Figure 1. We distinguish three
different domains: (i) the ground domain, with commuters and PT vehicles which are connected to the
mobile network; (ii) the edge domain, hosting the virtual counterparts of the physical entities,
both for the commuters and the PT vehicles; (iii) the remote applications through which third
parties, such as the road authorities, the transport operators, the municipality and the MaaS
provider, can get mobility data of commuters/PT vehicles.
A. The ground domain
We track the mobility of vehicles like buses, as well as of commuters. The former are equipped with
multi-interface OBSs, whereas commuters carry a smartphone, both acting as User Equipment (UE).
Each UE runs a UE App that interacts with the corresponding counterpart in the ME host, which is
the DT App.
The OBS is equipped with a GPS transceiver and, similarly to [13], with a Bluetooth Low Energy (BLE)
beacon that broadcasts a series of identifiers. BLE on board the bus is leveraged to facilitate
e-ticketing procedures [13].
Information about the commuter position is retrieved through the GPS receiver of the smartphone.
However, whenever the UE App infers the smartphone to be on board the bus, the GPS receiver is
switched off to reduce the energy consumption of the device. Hence, the smartphone offloads to the
OBS the task of updating the DT with its location data.
Unlike in [3], where the smartphone of the commuter connects to the on-board Wi-Fi network and is
thereby detected to be on board, in this work the smartphones of passengers infer that they are on
board the bus when they receive the BLE identifiers.
As also defined in the Android LocationManager¹, in order to affect battery lifetime as little as
possible, the UE App on the smartphone will communicate with the corresponding DT App only when
necessary, i.e., whenever its position changes by at least minDistance meters. The same approach
applies to the OBS. Since the OBS is powered, battery consumption is not a concern for it.
Nevertheless, for the OBS such an approach has the advantage of reducing the amount of data
exchanged over the mobile network.
¹ https://developer.android.com/reference/android/location/LocationManager
B. The edge domain
DTs are deployed as virtualized applications at the edge. In particular, without loss of
generality, we assume that they are instantiated at the closest ME host. The selection of the most
proper ME host where the DT of a given commuter/PT vehicle has to be placed is performed by the ME
orchestrator and is outside the scope of the present work.
Each DT App interacts with the UE App to get mobility data. Compared to the work in [7], which
considers only vehicles as physical entities, we extend the concept of DT to commuters.
Data are locally stored at the edge and periodically transferred to the remote cloud for long-term
storage and analytics purposes. Besides benefiting from low latency access to storage and computing
services close to where they are needed, DT applications hosted in MEC facilities can further
access additional (context) information and provide accurate positioning via data fusion from
multiple available sources, also improved by the ME LS. Retrieved mobility data typically have a
city-wide scope and could benefit from local processing [3].
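As a minimal sketch of the distance-triggered update rule of the ground domain, the following
Python fragment forwards a GPS fix to the DT App only when it lies at least minDistance meters away
from the last forwarded fix. The names haversine_m, UpdateFilter and filter_fixes are hypothetical
and not part of the PoC; measuring the displacement with the great-circle (haversine) distance is
also an assumption, since the text does not state how the position change is computed.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters

Fix = Tuple[float, float]  # (latitude, longitude) in decimal degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass
class UpdateFilter:
    """Hypothetical helper: decides whether a GPS fix should be sent to the DT App."""
    min_distance_m: float
    last_sent: Optional[Fix] = None

    def should_send(self, lat: float, lon: float) -> bool:
        # The first fix is always sent so that the DT has an initial position.
        if self.last_sent is None:
            self.last_sent = (lat, lon)
            return True
        # Later fixes are sent only if the device moved far enough since the last sent fix.
        if haversine_m(*self.last_sent, lat, lon) >= self.min_distance_m:
            self.last_sent = (lat, lon)
            return True
        return False


def filter_fixes(fixes: List[Fix], min_distance_m: float) -> List[Fix]:
    """Return the subset of fixes that would actually be transmitted."""
    update_filter = UpdateFilter(min_distance_m)
    return [fix for fix in fixes if update_filter.should_send(*fix)]
```

With a threshold of 10 m, a commuter waiting at a bus stop produces repeated identical fixes that
are all suppressed after the first one, which is the bandwidth and battery saving the rule aims
for. The displacement is measured from the last transmitted fix rather than from the previous
sample, so slow drift still triggers an update once it accumulates to minDistance.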


Fig. 1. The proposed framework.

C. Interactions between physical devices and DTs

In alignment with the existing literature [14], for the interaction between physical devices and their corresponding DTs we consider two of the most widespread IoT messaging protocols, as detailed in the following.

1) MQTT: The first option relies on MQTT, a lightweight messaging protocol largely used in IoT contexts in the presence of small sensors and mobile devices, optimized for unreliable networks [9]. The MQTT protocol [9] leverages a publish/subscribe approach, in which a client subscribes to a topic and receives notifications via a server whenever a new message is generated on that topic by the node acting as publisher. An MQTT server plays the role of message broker between publishers and subscribers. MQTT leverages the Transmission Control Protocol (TCP) at the transport layer. In addition, it offers three levels of message transmission reliability. With Quality of Service (QoS)=0 (a.k.a. at most once delivery), messages are simply sent once and are not acknowledged. With QoS=1 (a.k.a. at least once delivery), acknowledgements are used and messages are retransmitted if no acknowledgement is received before the expiration of a timeout. With QoS=2 (a.k.a. exactly once delivery), a four-way handshake is used to ensure that a message arrives exactly once.

In our framework, the MQTT publisher is implemented within the UE App. The DT hosts an MQTT subscriber which is interested in retrieving data published by the corresponding physical device (either the smartphone or the OBS). The MQTT broker is deployed as an ME app. Several publishers and subscribers may exchange messages through the same broker.

2) CoAP and OMA LwM2M: As a second option we consider CoAP [4], a well-known protocol which allows IoT devices to operate in a Web-like fashion. CoAP request/response methods allow the interaction between a client, which in the CoAP terminology refers to the node requesting data, and the server, i.e., the node hosting the resource (e.g., the value of a temperature, a location), which is typically an IoT device.

The connectionless User Datagram Protocol (UDP) is leveraged as a transport protocol and therefore retransmissions are managed at the application layer. In particular, applications can send reliable (confirmable) or non-reliable (non-confirmable) CoAP messages. Confirmable messages are retransmitted until acknowledged by the receiver or until a maximum number of retransmissions has been reached.

In addition to the request/response approach typical of the HyperText Transfer Protocol (HTTP), CoAP can also work in a publish/subscribe manner through the OBSERVE extension, which enables efficient asynchronous monitoring of IoT resources. CoAP clients can send a request with an OBSERVE header option to a CoAP resource, described through a Uniform Resource Identifier (URI). The CoAP server tracks such subscriptions and sends a Notification message to the clients whenever the observed resource changes. In the reference scenario of our study, the position of a commuter does not change when she is at the bus stop; the same holds for a PT vehicle stopped at a red traffic light. Hence, the OBSERVE extension saves bandwidth resources and, consequently, battery of the (mobile) device acting as server, compared to the GET/POST primitives [3].

The OMA LwM2M protocol complements CoAP with a simple object-based resource model to facilitate interoperability in resource description and discovery. Its usage in vehicular domains is also suggested in [3] and [15].

In our framework, we assume that an OMA LwM2M client is implemented within the UE App. It is in charge of retrieving data from the GPS (of the smartphone or of the OBS) and sending them through CoAP to the OMA LwM2M server, which is implemented in the ME host and which the DT has access to.

3) Mobility data: For both messaging protocol options, the same information is transmitted. More precisely, starting from the Location object (object id 6) defined in the OMA LwM2M Object and Resource Registry², only the essential mobility-

2 http://www.openmobilealliance.org/wp/omna/lwm2m/lwm2mregistry.html


related data are transmitted. Such fields are Latitude, Longitude, Timestamp and Speed, as reported in Table I and requested through a URI and a topic for the CoAP and OMA LwM2M and the MQTT options, respectively. The meaning of such fields is reported in Table II.

TABLE I
LOCATION DATA AND HOW THEY ARE REQUESTED BY THE TWO MESSAGING PROTOCOLS.

Information   CoAP and OMA LwM2M: Resource ID   CoAP and OMA LwM2M: Obj URI path   MQTT: Topic
Latitude      0                                 /6/0/                              ntf/raspberrypi/6/0/
Longitude     1                                 /6/0/                              ntf/raspberrypi/6/0/
Timestamp     5                                 /6/0/                              ntf/raspberrypi/6/0/
Speed         6                                 /6/0/                              ntf/raspberrypi/6/0/

TABLE II
MEANING OF THE CONSIDERED Location FIELDS.

Information   Description
Latitude      The decimal notation of latitude.
Longitude     The decimal notation of longitude.
Timestamp     The timestamp of when the location measurement was performed.
Speed         The time rate of change in position without regard for direction: the scalar component of velocity.

D. The remote applications

Different stakeholders may be interested in the mobility data of commuters and PT vehicles. In order to preserve the security/privacy of potentially sensitive data related to commuters and/or owned by transport operators, each DT may expose such data to requesting authorized applications through properly configured views.

Traditional HTTP primitives can then be used by the remote applications to query the DTs. Indeed, there is no need for lightweight protocols (like CoAP), which are instead required when interacting with resource-constrained physical devices.

IV. PERFORMANCE EVALUATION

A. Experimental setup

The objective of the evaluation study is twofold: it aims (i) to provide a Proof of Concept (PoC) of the proposed framework by leveraging off-the-shelf components and (ii) to compare the considered messaging protocols in terms of effectiveness and efficiency under different settings.

More in detail, the analysis focuses on capturing mobility data of a PT vehicle during a bus ride in an urban context. Hence, only the interactions between the OBS and the corresponding DT are considered. In so doing, thanks to the offloading policy for the positioning tasks, the mobility of commuters on board the bus can also be tracked.

1) Hardware components: For the OBS implementation, we leveraged a Raspberry Pi [10], an inexpensive, fully customizable and programmable single board computer with support for a large number of input/output peripherals and network communication interfaces. We attached to such a board the Adafruit Ultimate GPS Breakout receiver [16], a high-quality and energy-efficient GPS module that can track up to 22 satellites on 66 channels, with an excellent high-sensitivity receiver, and a built-in antenna.

To collect position data to be sent to the DT, a trip of around 7 km has been performed in an urban environment, i.e., the city of Reggio Calabria, along the trajectory shown in Fig. 2. The trip includes bus stops, and correspondingly, the position is not updated.

Edge facilities hosting the DT App have been emulated through an Asus laptop with CPU Intel Core i7-6500U, 12 GB RAM and 512 GB SSD.

The connectivity between the OBS and the ME host is emulated through a wired link and the tc Linux utility [17] has been used to reproduce different packet loss settings over such a link.

2) Software modules: For the case in which CoAP is selected as messaging protocol, our platform relies on Leshan³, the Java implementation provided by the Eclipse Foundation, which allows developing OMA LwM2M-compliant servers and clients. Such implementation covers the majority of the OMA LwM2M specifications⁴. It is based on the Californium CoAP implementation.

Measurements have been performed with both the Non-confirmable Message exchange and the Confirmable Message exchange options.

As for MQTT, we rely on Mosquitto [18], which is an open-source implementation of the message broker.

B. Metrics

The following metrics have been measured:

• Message delivery ratio: it is computed as the ratio between the number of messages successfully received at the DT side and the number of messages generated by the UE App during the trip.
• Byte overhead: it is derived as the overall number of bytes transmitted by the involved entities, i.e., Leshan client/server or MQTT publisher/broker, over the number of actual bytes corresponding to the position data generated by the UE App (i.e., 57 bytes). The metric also includes TCP Acknowledgements in the case MQTT is considered.

The metrics have been evaluated through the Wireshark⁵ protocol analyzer.

Results averaged over 10 independent experimental runs are reported.

C. Results

Fig. 3 shows the message delivery ratio for CoAP with the Non-confirmable Message exchange, CoAP with the Confirmable Message exchange, and MQTT QoS 0. Both MQTT and CoAP with the Confirmable Message exchange achieve full reliability.

3 https://www.eclipse.org/leshan/
4 https://github.com/eclipse/leshan/wiki/LWM2M-Supported-features
5 https://www.wireshark.org/
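The two metrics defined in Section IV-B can be sketched in a few lines of Python. This is only an illustrative reading of the definitions, not the authors' measurement scripts: the function names are hypothetical, and it is an assumption of this sketch that the "actual bytes" in the byte overhead are the number of generated messages multiplied by the 57-byte position record.

```python
POSITION_DATA_BYTES = 57  # size of one position record, as stated in Section IV-B


def message_delivery_ratio(received_at_dt: int, generated_by_ue_app: int) -> float:
    """Ratio of messages received at the DT side to messages generated by the UE App."""
    if generated_by_ue_app <= 0:
        raise ValueError("at least one generated message is required")
    return received_at_dt / generated_by_ue_app


def byte_overhead(total_bytes_on_wire: int, generated_by_ue_app: int,
                  payload_bytes: int = POSITION_DATA_BYTES) -> float:
    """Overall transmitted bytes divided by the useful position-data bytes.

    Assumption of this sketch: useful bytes = generated messages * payload_bytes.
    total_bytes_on_wire would come from a packet capture (e.g., a Wireshark trace)
    and, for MQTT, would also count TCP acknowledgements.
    """
    useful_bytes = generated_by_ue_app * payload_bytes
    if useful_bytes <= 0:
        raise ValueError("useful byte count must be positive")
    return total_bytes_on_wire / useful_bytes
```

For instance, with hypothetical counts of 100 generated and 90 received messages the delivery ratio is 0.9, and a capture of 11400 bytes for those 100 messages gives a byte overhead of 2.0.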


Fig. 2. The trajectory of the trip.

Full reliability is possible in MQTT thanks to TCP-triggered retransmissions, whereas CoAP with the Confirmable Message exchange emulates TCP acknowledgments at the application layer. If the unreliable option of CoAP is considered, the percentage of received messages equals the link-layer reliability settings.

Fig. 3. Message delivery ratio for the three compared messaging protocols under different packet loss settings.

Fig. 4 reports the byte overhead metric. It can be observed that, although providing the same reliability performance, the byte overhead incurred by CoAP with the Confirmable Message exchange is significantly lower than that of MQTT. This is due to the more compact formatting of OMA LwM2M w.r.t. MQTT. Indeed, the size of transmitted packets carrying the same information differs for the two protocols. We have 39 bytes for the payload of the CoAP Notification using the Type-Length-Value (TLV) format, and 58 bytes for the payload of the MQTT Publish Message transmitted in hexadecimal format. Both sizes can vary by 1 byte according to the transmitted position.

Besides the lower footprint in terms of exchanged bytes due to the usage of a binary format, the TLV format foreseen by OMA LwM2M has the additional advantage of facilitating the interpretation of conveyed fields by different stakeholders, hence natively ensuring higher interoperability.

Fig. 4. Byte overhead for the three compared messaging protocols.

V. CONCLUSION AND FUTURE WORKS

In this paper we have presented a framework to track commuters' and PT vehicles' mobility. The proposal builds upon emerging IoT technologies for the collection, delivery, processing and presentation of mobility-related data to serve MaaS applications and services by other interested stakeholders. The implementation of a realistic PoC confirms the viability of the proposal and provides helpful insights about the effectiveness and efficiency of the candidate messaging protocols for the interactions of physical devices with the corresponding DTs. Preliminary results about the computation footprint of the DT application (not shown in the paper) indicate that it is negligible. Hence, as future work we plan the deployment of the DT application as a Docker container, as well as the evaluation of its memory and CPU footprint when varying the available processing resources and the number of commuters for a given ME host, to figure out potential scalability issues for the actual deployment.

ACKNOWLEDGMENT

This work has been partially supported by the “Mobility for Passengers as a Service” (MyPasS) project, funded by the Italian Government (through the PON 2014-2020 initiative).


REFERENCES
[1] G. Smith, J. Sochor, and I. M. Karlsson, “Mobility as a service: De-
velopment scenarios and implications for public transport,” Research in
Transportation Economics, vol. 69, pp. 592–599, 2018.
[2] A. Nikitas, I. Kougias, E. Alyavina, and E. Njoya Tchouamou, “How
can autonomous and connected vehicles, electromobility, brt, hyperloop,
shared use mobility and mobility-as-a-service shape transport futures for
the context of smart cities?” Urban Science, vol. 1, no. 4, p. 36, 2017.
[3] C. Campolo, D. Cuzzocrea, G. Genovese, A. Iera, and A. Molinaro, “An
OMA lightweight M2M-compliant MEC framework to track multi-modal
commuters for MaaS applications,” in 2019 IEEE/ACM 23rd International
Symposium on Distributed Simulation and Real Time Applications (DS-
RT). IEEE, 2019, pp. 1–8.
[4] C. Bormann, A. P. Castellani, and Z. Shelby, “CoAP: An application
protocol for billions of tiny internet nodes,” IEEE Internet Computing,
vol. 16, no. 2, pp. 62–67, 2012.
[5] “Open Mobile Alliance, Lightweight Machine to Machine Technical
Specification Core; v1 1-20180612-c,” 2018.
[6] Q.-V. Pham, F. Fang, V. N. Ha, M. Le, Z. Ding, L. B. Le, and
W.-J. Hwang, “A survey of multi-access edge computing in 5G and
beyond: Fundamentals, technology integration, and state-of-the-art,” arXiv
preprint arXiv:1906.08452, 2019.
[7] D. Person Pros and N. Carlsson, “Performance comparison of messaging
protocols and serialization formats for digital twins in IoV,” in IFIP
Networking, 2020.
[8] “ETSI GS MEC 003 v1.1.1. Mobile Edge Computing (MEC); Framework
and Reference Architecture,” March 2016.
[9] A. Banks and R. Gupta, “MQTT version 3.1.1,” OASIS standard, vol. 29,
p. 89, 2014.
[10] “Raspberry pi, https://www.raspberrypi.org/.”
[11] A. El Saddik, “Digital twins: The convergence of multimedia technolo-
gies,” IEEE MultiMedia, vol. 25, no. 2, pp. 87–92, 2018.
[12] M. Nitti, V. Pilloni, G. Colistra, and L. Atzori, “The virtual object as a
major element of the internet of things: a survey,” IEEE Communications
Surveys & Tutorials, vol. 18, no. 2, pp. 1228–1240, 2015.
[13] G. Tuveri, M. Garau, E. Sottile, L. Pintor, M. Gravellu, L. Atzori, and
I. Meloni, “Automating ticket validation: A key strategy for fare clearing
and service planning,” in 2019 6th International Conference on Models
and Technologies for Intelligent Transportation Systems (MT-ITS). IEEE,
2019, pp. 1–10.
[14] Z. Laaroussi, R. Morabito, and T. Taleb, “Service provisioning in vehicu-
lar networks through edge and cloud: an empirical analysis,” in 2018 IEEE
Conference on Standards for Communications and Networking (CSCN),
pp. 1–6.
[15] S. K. Datta, J. Haerri, C. Bonnet, and R. F. Da Costa, “Vehicles as
connected resources: Opportunities and challenges for the future,” IEEE
Vehicular Technology Magazine, vol. 12, no. 2, pp. 26–35, 2017.
[16] Adafruit, “Adafruit Ultimate GPS Breakout.” [Online]. Available:
http://www.adafruit.com/product/746description-anchor
[17] M. A. Brown, “Traffic control HOWTO,” 2017. [Online]. Available:
http://www.tldp.org/howto/traffic-control-howto/
[18] “Mosquitto, MQTT open-source implementation, https://mosquitto.org/.”


Session-level Adversary Intent-Driven Cyberattack Simulator

Martin Drašar, Institute of Computer Science, Masaryk University, Brno, Czechia, drasar@ics.muni.cz
Stephen Moskal, Shanchieh Yang, Department of Computer Engineering, Rochester Institute of Technology, Rochester, NY, USA, {sfm5015,jay.yang}@rit.edu
Pavol Zat’ko, Faculty of Informatics, Masaryk University, Brno, Czechia, 456131@mail.muni.cz

Abstract—Recognizing the need for proactive analysis of cyber adversary behavior, this paper presents a new event-driven simulation model and implementation to reveal the efforts needed by attackers who have various entry points into a network. Unlike previous models which focus on the impact of attackers’ actions on the defender’s infrastructure, this work focuses on the attackers’ strategies and actions. By operating on a request-response session level, our model provides an abstraction of how the network infrastructure reacts to access credentials the adversary might have obtained through a variety of strategies. We present the current capabilities of the simulator by showing three variants of Bronze Butler APT on a network with different user access levels.

Index Terms—DEVS, cybersecurity, adversary behavior, APT

This research was supported by ERDF "CyberSecurity, CyberCrime and Critical Information Infrastructures Center of Excellence" (No. CZ.02.1.01/0.0/0.0/16_019/0000822), and by US NSF Award # 1742789.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Historically, cybersecurity research on adversary behavior was reactive rather than proactive. This is because most of the automated attacks were built around a limited set of vulnerabilities and followed a predefined tree of actions in a fire-and-forget manner. Recognizing these actions and tracing them in system artifacts was therefore usually enough to either prevent the attacks or predict their evolution. Complex and creative attacks were deemed a domain of trained human professionals and were analyzed only partially as their action-effect components. This changed, however, with the advent of Advanced Persistent Threats (APT), typically reflecting malware and tactics used by state-sponsored actors. This type of malware, e.g., Stuxnet [1], is infamous for its destruction of Iranian nuclear centrifuges and exhibits traits attributed to human attackers and often favours stealth above all else. Due to the relative rarity of such malware and limited observation of its effects, reactive approaches are limited and effective defense needs proactive approaches to simulate and evaluate adversarial behavior. Existing works mostly focus on the impact of attackers’ actions and not on the actions and attack strategies themselves. This work addresses such limitations in the current state of cyber attack simulation research.

To address the lack of appropriate adversary-focused simulation tools, this paper brings two main contributions:

• Introduction of the concept and the implementation of a new simulation model, enabling evaluation of adversary behavior on the session level.
• Enabling integration of different attack models within one simulation engine, demonstrating its flexibility.

This paper is structured as follows. Section II provides a review of the relevant state of the art. In Section III, we introduce the proposed simulation model and describe its implementation and integration with different attack models. Section IV presents our case study referring to the Bronze Butler APT (BB) and its implementation in the simulator engine. In Section V, we evaluate the simulator engine by showing and reasoning about different BB attack strategies using random and learning attackers. We conclude the paper and discuss future opportunities in Section VI.

II. STATE OF THE ART

There exist several approaches to simulate the behavior of adversaries in networked systems. Some are designed specifically to test intrusion detection systems (IDS), e.g., [2], while others expertly define static attack scenarios with little configurability [3]. Some use real networks (virtual machines) [4], and the data is often tailored to a specific type of attack [5], [6]. In this section, we review three branches of approaches modelling the interactions between adversaries and the networked systems, which are relevant to the adversary simulation approach we introduce in this paper.

A. Attack Graphs

Attack graphs were first proposed by Swiler et al. [7] and are used to simulate steps an attacker can take within the infrastructure. They describe an abstracted network topology and show the nodes, paths and consequences of network attacks. Once an attack graph is constructed, it enables various tasks of network security analysis. The use of attack graphs is a widely researched topic including attack graph generation [8], [9], application scenarios [10], [11] and analytic methods [12]. To support automatic graph generation, tools such as MulVAL, NetSPA, or TVA were developed, as summarized in [13].
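To make the notion of an attack graph concrete, the following minimal sketch enumerates the attack paths in a small directed graph. It is only an illustration under stated assumptions: the host names and edges are hypothetical, and real generators such as those cited above work on much richer inputs (vulnerabilities, privileges, reachability) than this toy adjacency list.

```python
def attack_paths(graph, start, goal, path=None):
    """Enumerate all loop-free paths from start to goal by depth-first search."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # do not revisit a host already on this path
            paths.extend(attack_paths(graph, nxt, goal, path))
    return paths


# Hypothetical topology: an edge A -> B means "an attacker on A can compromise B".
graph = {
    "internet": ["web_server", "vpn_gateway"],
    "web_server": ["db_server"],
    "vpn_gateway": ["workstation"],
    "workstation": ["db_server"],
}

paths = attack_paths(graph, "internet", "db_server")
for p in paths:
    print(" -> ".join(p))
```

Even in this toy form, building the graph requires knowing every host and every exploitable edge in advance.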


The downside to attack graphs is that they require the totality of knowledge about the target infrastructure and known vulnerabilities. While they model possible attacker actions, they are in effect centered on the defense and represent a vulnerability model rather than an attack model. They offer only limited options to analyze adversarial behavior.

B. Game Theoretic Approaches

Game theoretic approaches applied to cyber security are well researched and traditionally involve an attacker-defender model where the defender optimizes their defensive strategy for risk minimization [14]–[17] or maximizes the uptime of network assets [18]–[20]. The games played rely on some degree of information sharing (complete or incomplete) where typically the defender observes the attacker and responds according to their objective function [21]. Many works aimed at the attacker either focus on a specific attack type like distributed denial of service (DDoS), which abuses a large number of machines to disable a target service by overloading it with requests [22], or rely on unspecified mission models [23]. Liang et al. mention that the attacker-defender model specifically for impact assessments requires extensive data to understand the dynamic relationships between the attacker and defender, creating complex models that may or may not have a solution [21].

C. Simulation Approaches

Similar to game theory, the impacts of attacks and attackers can be realized through the use of configurable cyber attack simulation platforms. NeSSi2, an agent-based simulation platform by Grunewald et al. [24], models a packet-level description of a network with the primary focus on simulating the effects of DDoS attacks. NeSSi2 models the effects of various worm behaviors and how worms propagate through a network. This technique proves to be useful in other contexts such as smart grid networks [25]. Moskal et al. [26], [27] present a knowledge-based cyber attack simulator CASCADES, where the attacker’s actions are determined by the “Attacker Behavior Model" (ABM) and the knowledge obtained about the target network through performing actions on the network. CASCADES focuses on a Monte-Carlo style approach to attack simulation and generates thousands of plausible attack scenarios given the ABM and a detailed network description known as the Virtual Terrain (VT).

Several other simulation platforms exist and are mostly based on a discrete event formalism, e.g., Chi et al. [28], Liljenstam et al. [29], Futoransky et al. [30], and Kuhl et al. [31]. These platforms offer synthetic network and attack emulation and evaluation. To overcome limited realism, emulators using virtual machines and integrated with offensive tools are available, such as DCAFE [32] and SVED [33]. It is also worth noting that others have experimented with cybersecurity simulations based on general-purpose simulators such as OMNeT++ [34].

III. SIMULATION MODEL

In this section we present a new non-stochastic simulation model based on discrete event formalism, which enables synthetic network attack emulation and evaluation, dynamic selection of attack models, and integration with non-simulated IDS systems. This model thus occupies a space between various simulation models and tools described earlier.

The simulator implementing the model as well as the evaluation scripts can be freely downloaded from here: https://muni.cz/go/565e43

A. Model goals

This work aims at developing a cyberattack simulator that models the interactions between the progression in adversary intended outcomes and the network session level responses. Here, the session level means that the units of interaction between adversaries and the attacked environment are requests and responses, i.e., a rough equivalent of TCP sessions. Such interactions are meant to maximize autonomy for both the attackers and defenders. The model described in this paper is a step towards the longer-term and broader goals to enable:

• lightweight simulation of multi-agent cybersecurity scenarios,
• integration of different attack models,
• non-stochastic simulation of interaction between attackers and defenders for in-depth analysis of attack strategies,
• rapid prototyping of attack and defense strategies,
• *smooth transition of simulated actors into emulated and real-world settings,
• *modelling of environments, which can be emulated in virtual environments using already provided data,
• *integration of simulation and emulation to remove the need to re-implement existing cyberdefense mechanisms.

Note that the last three goals (marked with *) are outside of the scope of this paper; yet they influence the current model design and development. Section VI briefly describes the relevant projects and activities beyond this paper linking to the long-term goals.

B. Model components

The simulation model adopts the message-based approach and consists of a number of components, which can be divided into four levels: environment, network, host, and logical.

1) Environment level: The environment level is the top-most layer of components, which are used for orchestration of particular scenario runs. There are two components present: message and environment.

Message is a unit of information exchanged between actors in the simulation. The message carries routing and statistical information, activity descriptions and actors’ responses.

Environment keeps track of all simulation elements, manages interaction between these elements by passing messages, controls the simulation time and evaluates the impact of actors’ activities. It is the only point of interaction between actors and the simulation.

2) Network level: The network level represents the topology of the simulation. The components used mimic the components of the network, with some simplifications enabled by the conceptual level the simulation happens on. The components are: nodes, firewalls, connections, routers, and sessions.
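The environment-level components described above (a message carrying routing and statistical information, and an environment that passes messages and controls simulation time) can be sketched as a tiny discrete-event loop. This is an illustrative sketch only, not the simulator's actual code: the class and attribute names, the next-hop table, and the one-time-unit delay per hop are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Message:
    src: str
    dst: str
    payload: str                               # activity description (hypothetical format)
    hops: list = field(default_factory=list)   # statistical information: nodes visited


class Environment:
    """Owns the event queue and the simulation clock; the only place messages move."""

    def __init__(self, next_hop):
        self.next_hop = next_hop   # (current node, destination) -> next node
        self.time = 0
        self.delivered = []        # (delivery time, message)
        self._queue = []
        self._seq = count()        # tie-breaker so messages are never compared

    def send(self, msg, at_node, delay=1):
        """Schedule the arrival of msg at at_node after the given delay."""
        heapq.heappush(self._queue, (self.time + delay, next(self._seq), at_node, msg))

    def run(self):
        """Process events in time order until the queue is empty."""
        while self._queue:
            self.time, _, node, msg = heapq.heappop(self._queue)
            msg.hops.append(node)
            if node == msg.dst:
                self.delivered.append((self.time, msg))
            else:
                self.send(msg, self.next_hop[(node, msg.dst)])


# Hypothetical three-element topology: attacker -> router -> target.
env = Environment({("attacker", "target"): "router", ("router", "target"): "target"})
request = Message(src="attacker", dst="target", payload="scan")
env.send(request, at_node="attacker", delay=0)
env.run()
print(env.time, request.hops)
```

Because the queue is ordered by time and then by insertion sequence, repeated runs with the same inputs process events in the same order, which is in line with the non-stochastic character of the model.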


Nodes represent physical or virtual machines. Each node is accessible from the outside via a set of network ports, which are a simplification of an Ethernet port, i.e., these ports have an IP address and can be uniquely identified (although no explicit MAC addressing is used).

Firewalls function as their real-world counterparts by controlling inbound and outbound messages. They implement an equivalent of a simplified filter table of iptables with source-destination filtering and default filtering policies.

Connections represent links between ports of particular nodes. Messages go through the connections and can be affected by connection properties.

Routers partition the networks. Unlike in real network settings, they are the only active switching elements. Routers control permeability between different networks and enable fine-grained control depending on both sources and destinations of messages.

Sessions represent a set of connections going through the network, which are not subject to routing policies in the intermediate routers. An example of a session is a VPN tunnel or a tunnel through several layers of NAT. The name stems from attacker taxonomy, where an attacker exploits weaknesses in infrastructure to open sessions to or from their targets, which would be otherwise prohibited by intermediate active network elements.

3) Host level: The host level covers activities happening at a node. In addition to network ports, a node is modelled as a set of services representing running processes. A node does not define an OS, as this is instead expressed as a set of services representing OS functions required for simulation. Services come in two variants, which are the components of the host level: active services and passive services.

Active services can initiate the communication with their surroundings by sending messages through the environment. They also process inbound messages and react according to their programmed behavior. Thus, they must understand the semantics of the incoming messages. Typical examples of active services are attackers and defenders, i.e., actors whose behavior is the focus of a simulation.

Passive services, on the other hand, do not initiate communication. Inbound messages are instead evaluated by the environment based on the definition of a passive service. The definition contains service name, version, ability to create sessions, locality, etc. Passive services are used to create a believable environment for the active services, while not pushing the burden of implementation on the user of the simulator.

4) Logical level: The logical level represents the activity domain for active actors, and relations between scenario elements and scenario goals. There are four components: data, authorizations, exploits, and actions.

Data represent units of information, which may be interesting to an attacker, such as trade secrets or employee records. Obtaining data does not help an attacker with the actual attack, but they may be essential to reaching the goal of the given scenario.

Authorizations encode the ability of actors to access particular services or data. They can be defined in the scenario configuration or they can be created as a result of actors’ activities, e.g., a new authorization resulting from successful privilege escalation. Terminology-wise, they conflate both authorization and authentication for the sake of simplicity.

Exploits represent mechanisms to abuse vulnerabilities of particular services and are tied to the name and the version of a service. To enable better machine reasoning about exploits and their effects, they are categorized by their effect and locality and allow only limited parametrization to create a bounded exploit domain from which an attacker can choose. The exploits are expected to map to real-life exploits, such as those listed in services like the National Vulnerability Database [35] or Common Vulnerabilities and Exposures [36].

Actions represent the type of activity of actors. They can comprise anything from getting the simulation time to launching a DDoS attack. They are a means to express an attack model within a simulator. In this case, we understand the attack model as an abstraction of activities an attacker can perform. The actions are then the elements of the action space defined by a particular model. One such model, which we used for our simulation evaluation, is presented in the following text.

C. Attack model

Defining the action space of the adversary is a particularly challenging task for cyber-attack simulators as the action space is effectively infinite, constantly expanding, and extremely diverse in the types of actions that can be performed. Modelling each vulnerability in the simulator is time consuming and unsustainable, so we choose to represent the action space as an abstraction of the objective or intent of an attacker given the simulated attack stage of the attacker. The Action-Intent Framework (AIF) [37] is a cyber-attack action classification framework where the focus is to describe attack actions with respect to the intended objective of performing a specific action, such as information discovery, privilege escalation, data exfiltration, etc. The AIF differentiates itself from other attack descriptions by providing significantly more detail than typical Cyber Attack Kill Chains while, by remaining network and service agnostic, finding a middle ground between them and the highly detailed MITRE ATT&CK® [38].

The AIF is broken up into two layers of abstraction: the Macro Action-Intent States (Macro-AIS) describe the effect of the actions at a high level, such as reconnaissance or destroy information, whereas the Micro Action-Intent States (Micro-AIS) describe the method used to achieve the corresponding Macro-AIS. An example is a brute-force credential access Micro-AIS for the privilege escalation Macro-AIS. We use the Micro-AIS to represent our desired simulated attack scenarios and as a method to select simulated actions given the network topology and the services running on the network. Given the session-based approach of our proposed simulation architecture, the abstracted model of the attacker’s process will allow for attack scenarios to be quickly created and then applied to the network topology without the need for detailed


exploit definitions. In Section IV we demonstrate how a known description of a real cyber-attack can be described using the AIF and then we use that description as the driving force of our simulation engine.

Fig. 1. Diagram of model components (elements shown: environment, message flow, nodes, router, services, firewalls, routing table, data, authorizations, exploits)

D. Integration of Components and Simulation Execution

Fig. 1 illustrates the component relations and how messages traverse between components. When a simulation run begins, all active services are executed. The services produce messages, which are inserted into environment queues and distributed on a hop-by-hop basis to their intended targets. Thus the message transport mimics the packet transport over a network. The messages trigger component responses, simulating the actions and responses when the network is attacked.

A message passing a connection can arrive at a router, an active service, or a passive service. Each component computes a simulation-relative processing time, which models link delays and processing complexity. On arriving at a router, the message can either be forwarded or dropped based on firewall and routing rules. If the message arrives at a passive service, it is evaluated by the environment, which has an implementation of the attack model’s semantics (in our case the AIF). The model’s implementation decides on the response given the action (in our case a Micro-AIS), message, and passive service parameters. If the message arrives at an active service, the service decides on the response based on the observable properties of the message. Note that active services do not have access to the action description, as it would be equivalent to knowing an attacker’s intent just by looking at the packet and would bypass the hard problem of cybersecurity analysis.

IV. CASE STUDY: BRONZE BUTLER APT

To present the expressive power of the simulation engine and to show how it can be used to reason about attackers’ and defenders’ abilities, we consider various scenarios simulating the Bronze Butler APT breaching an organization with the goal of data theft. We model the Bronze Butler APT with the Micro-AIS described in Sec. III-C and compare it to the MITRE ATT&CK framework [39]. We then present the common network topology and an attack graph modelling the three variants of the breach scenario, which we further discuss.

A. Bronze Butler in Micro-AIS

Bronze Butler is a well documented Chinese hacker group infamous for targeting Japanese critical infrastructures between 2012 and 2017. Bronze Butler has been reported to use a variety of spearphishing techniques, remote access exploits, and web-based zero-day malware to target high profile executives to obtain sensitive business strategies and sales information. Depending on the target, Bronze Butler employed two techniques to gain initial access: 1) a spearphishing email to an executive with a malicious attachment [40], or 2) exploitation of VPN services to gain access to the target network [41]. The end goal of Bronze Butler is to exfiltrate critical business or user information through the use of file-share servers.

We choose to use Bronze Butler for our case study as the techniques employed by Bronze Butler are sufficiently complex to demonstrate the capabilities of our simulation engine, exhibiting distinct behaviors that are well represented in attack action descriptions such as MITRE ATT&CK and the AIF.

Bronze Butler is composed of a team of highly skilled attackers. However, we abstract the behaviors of Bronze Butler as a single entity and represent the behaviors as a set of Micro-AIS to represent their scenario in our simulation engine. Using the threat reports from SecureWorks [41] and the technique description from MITRE ATT&CK [42], we map attack action evidence to a corresponding Micro-AIS to capture some of the key behavioral properties of Bronze Butler that will be used as the basis of our simulation experiments.

Table I summarizes Bronze Butler capabilities in terms of MITRE ATT&CK, the AIF, and the simulation engine. The table demonstrates that the simulator using Micro-AIS as an attack model is able to simulate most of Bronze Butler behavior, with the exception of user interaction and host-level interaction, by means of simulated actions and their parameters.

B. Network Topology and Access Control

To emulate the various scenarios of how Bronze Butler can penetrate a network, we prepare a small-scale network as depicted in Fig. 2. The topology is partitioned into four logical segments, separated by routers with firewalls. The first segment is outside of the organization and represents the attacker. Note that Bronze Butler may compromise the organization’s partner in a different network domain. For simplicity, we consider all external sources in the same network. The second segment is the DMZ with a Web server, a VPN server, and an Email server. Each server has two network interfaces, one accessible from the outside and the other accessible from the inside. The third segment contains the desktop machines of an employee and a CTO. Machines in this segment can access the DMZ and can only be accessed from the SRV segment. The fourth is the SRV segment containing an API gateway to the organization’s services, a database server (DB), and a domain controller (DC). This segment is accessible from the DMZ and from the CTO’s PC. The API gateway is also accessible

TABLE I
BRONZE BUTLER APT CAPABILITIES IN TERMS OF THE MITRE ATT&CK FRAMEWORK, THE AIF, AND THE SIMULATION ENGINE.

Group | ATT&CK Technique | Technique example | Micro-AIS
Simulated actions | T1087 | used net user /domain to identify account information. | information discovery
Simulated actions | T1088 | malware xxmm contains a UAC bypass tool for privilege escalation. | user privilege escalation
Simulated actions | T1003 | used various tools to perform credential dumping. | information discovery
Simulated actions | T1005 | exfiltrated files stolen from local systems. | data exfiltration
Simulated actions | T1039 | exfiltrated files stolen from file shares. | data exfiltration
Simulated actions | T1140 | downloads encoded payloads and decodes them on the victim. | lateral movement
Simulated actions | T1083 | collected a list of files from the victim and uploaded it to its C2 server, and then created a new list of specific files to steal. | data exfiltration
Simulated actions | T1107 | uses command to delete the RAR archives after they have been exfiltrated. | data destruction
Simulated actions | T1097 | created forged Kerberos Ticket Granting Ticket (TGT) and Ticket Granting Service (TGS) tickets to maintain administrative access. | root privilege escalation
Simulated actions | T1060 | used a batch script that adds a Registry Run key to establish malware persistence. | lateral movement
Simulated actions | T1105 | used various tools to download files, including DGet (a similar tool to wget). | lateral movement
Simulated actions | T1018 | use ping and Net to enumerate systems. | host discovery
Simulated actions | T1053 | used at and schtasks to register a scheduled task to execute malware during lateral movement. | lateral movement
Simulated actions | T1102 | MSGET downloader uses a dead drop resolver to access malicious payloads. | lateral movement
Simulated actions | T1113 | used a tool to capture screenshots. | information discovery
Simulated actions | T1124 | used net time to check the local time on a target system. | -
Simulated actions | T1210 | used a CVE-2016-7836 to exploit VPN connection | command and control
Action parametrization | T1024 | used a tool called RarStar that encodes data with a custom XOR algorithm when posting it to a C2 server. | -
Action parametrization | T1002 | compressed data into password-protected RAR archives prior to exfiltration. | -
Action parametrization | T1059 | uses the command-line interface. | -
Action parametrization | T1132 | encode data with base64 when posting it to a C2 server. | -
Action parametrization | T1022 | compressed and encrypted data into password-protected RAR archives prior to exfiltration. | -
Action parametrization | T1086 | used PowerShell for execution. | -
Action parametrization | T1064 | used VBS, VBE, and batch scripts for execution. | -
Action parametrization | T1071 | used HTTP for C2. | -
Action parametrization | T1032 | used RC4 encryption (for Datper malware) and AES (for xxmm malware) to obfuscate HTTP traffic. | -
User interaction | T1189 | compromised three Japanese websites using a Flash exploit to perform watering hole attacks. | -
User interaction | T1193 | used spearphishing emails with malicious Microsoft Word attachments to infect victims. | -
User interaction | T1203 | exploited Microsoft Word vulnerability CVE-2014-4114 for execution. | -
User interaction | T1204 | attempted to get users to launch malicious Microsoft Word attachments delivered via spearphishing emails. | -
N/A | T1009 | included "0" characters at the end of the file to inflate the file size in a likely attempt to evade anti-virus detection. | -
N/A | T1036 | given malware the same name as an existing file on the file share server to cause users to unwittingly launch and install the malware on additional systems. | -
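The mapping in Table I can be treated as a simple lookup from ATT&CK technique to Micro-AIS. The following is a minimal sketch that copies only the rows of Table I carrying a Micro-AIS and tallies how often each Micro-AIS is used. The names `TECHNIQUE_TO_MICRO_AIS` and `micro_ais_histogram` are hypothetical and are not part of the simulator.

```python
from collections import Counter

# Rows of Table I that carry a Micro-AIS; rows marked "-" are left out.
# The dictionary name and layout are assumptions made for this illustration.
TECHNIQUE_TO_MICRO_AIS = {
    "T1087": "information discovery",
    "T1088": "user privilege escalation",
    "T1003": "information discovery",
    "T1005": "data exfiltration",
    "T1039": "data exfiltration",
    "T1140": "lateral movement",
    "T1083": "data exfiltration",
    "T1107": "data destruction",
    "T1097": "root privilege escalation",
    "T1060": "lateral movement",
    "T1105": "lateral movement",
    "T1018": "host discovery",
    "T1053": "lateral movement",
    "T1102": "lateral movement",
    "T1113": "information discovery",
    "T1210": "command and control",
}


def micro_ais_histogram(mapping):
    """Count how many ATT&CK techniques map to each Micro-AIS."""
    return Counter(mapping.values())
```

Calling `micro_ais_histogram(TECHNIQUE_TO_MICRO_AIS).most_common()` lists the Micro-AIS ordered by how many of the 16 mapped techniques they cover; the remaining 16 rows of the table have no Micro-AIS assigned.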

via a VPN tunnel from the outside. The simulated machines are populated with various services and access credentials to enable the attacker multiple paths through the network.

To limit the scope of the demonstration and to better reason about the simulation engine, the modelled network does not contain any active defenses. The defenses are passive and based on network and host access control. While the resulting configuration cannot be considered secure, it will be shown later that it is hardened enough to resist inept attack attempts.

C. Attack Graph and Scenario Variations

The combination of the network configuration, the deployed services, and the access credentials on the simulated hosts gives Bronze Butler multiple ways to achieve the ultimate goal of exfiltrating data from the DB server. We have manually crafted an attack graph from the total knowledge of the scenario. The attack graph covering all shortest paths is depicted in Fig. 3. To preserve clarity of the graph, the paths that do not lead to the goal are excluded; these paths, however, can be explored by the attacker in the simulation and can greatly prolong the attack duration. This can lead to a counter-intuitive behavior of an attacker with elevated privileges, which is later discussed in Section V-C.

Despite the possible variations, each path in the attack graph starts with a successful spearphishing attempt, because it is the predominant entry-point for the Bronze Butler APT. Note that the current simulation concentrates on modeling the interactions between the attacker and the network, and does not include, for example, the exact user interactions with the attacker’s phishing emails. For each scenario variant the attacker begins with the simulated artifact of the phishing attempt: an open session to the target machine.

There are three main variants of the scenario, which differ by the successfully spearphished machine. The first one represents a successful attack on the CTO of the organization, who can access the DB server and also has the necessary credentials to extract the goal data from the database. The second one represents a successful attack on the partner, who has API access via VPN into the SRV segment. The attacker has to use the API server as a stepping stone to get access to the domain controller and follow up by forging a golden Kerberos ticket, which is then abused to get access to the data. The most complicated variant begins with a successful attack on an employee. The attacker has to go through the Web server in the DMZ, discover domain controller credentials there, and continue the attack inside the SRV segment as in the previous case. Those three variants require the attacker to execute 2(3), 6(7), and 7(9) appropriate actions respectively to achieve the goal. The numbers above represent the minimal number of steps in the attack graph from the given start to the terminal node. The numbers in parentheses also include the reconnaissance steps, which would be necessary in a real-world setting. Note that each step could actually be one or many activities in simulation, especially in case of reconnaissance.

V. EVALUATION AND DISCUSSION

In this section, we introduce three different implementations of Bronze Butler, each representing a different attack strategy, namely a scripted, a random, and a learning attacker. We deploy these attackers into the simulator and let them attempt the three scenario variants. By analyzing the results and extracting insights into the attack strategies, we demonstrate how the simulator can be used to reason about particular attack strategies and about attackers’ behavior.

A. Scripted attacker

The scripted attacker represents the idealized situation and follows the shortest path in the attack graph from each of the


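Fig. 2. Scenario network topology (elements shown: Attacker and Partner outside the organization; DMZ with Web, VPN, and Mail servers; PC segment with Employee and CTO machines; SRV segment with API, DB, and DC)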
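The segment-level access control described in Section IV-B can be sketched as a small rule table. This is only an illustration under stated assumptions: the `SEGMENT:Host` naming, the rule sets `SEGMENT_RULES` and `HOST_RULES`, the function `can_reach`, and the choice to allow all traffic inside a segment are hypothetical and do not reflect the simulator's actual firewall model.

```python
# Segment-to-segment rules read off Section IV-B (source, destination).
SEGMENT_RULES = {
    ("OUTSIDE", "DMZ"),  # DMZ servers expose one interface to the outside
    ("PC", "DMZ"),       # desktop machines can access the DMZ
    ("SRV", "PC"),       # the PC segment can only be accessed from SRV
    ("DMZ", "SRV"),      # SRV is accessible from the DMZ
}

# Host-level exceptions named in the text.
HOST_RULES = {
    ("PC:CTO", "SRV"),       # SRV is also accessible from the CTO's PC
    ("OUTSIDE", "SRV:API"),  # API gateway is reachable via VPN from outside
}


def segment_of(host):
    """Return the segment prefix of a 'SEGMENT:Host' name."""
    return host.split(":", 1)[0]


def can_reach(src, dst):
    """Decide whether src may open a connection to dst under the rules above."""
    src_seg, dst_seg = segment_of(src), segment_of(dst)
    if src_seg == dst_seg:  # assumption: no filtering inside a segment
        return True
    if (src_seg, dst_seg) in SEGMENT_RULES:
        return True
    return (src, dst_seg) in HOST_RULES or (src_seg, dst) in HOST_RULES
```

Under these rules `can_reach("PC:CTO", "SRV:DB")` holds while `can_reach("PC:Employee", "SRV:DB")` does not, which is why the Employee variant has to go through the DMZ.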

three starting points, as depicted in Fig. 3. This attacker type is implemented as omniscient, i.e., knowing the topology of the infrastructure and all system weaknesses, so it does not need to perform reconnaissance tasks. Therefore, its number of actions is the lower bound on actions needed to finish each scenario variation. For the three scenario variants (CTO, VPN, and Employee), the abstract actions are:
• CTO: Acquire CTO credentials and Exfiltrate data from the DB server (2 total).
• VPN: Access infrastructure via VPN exploit, Acquire DC credentials, Establish session to the DC, Get root access to the DC, Get the golden ticket, Exfiltrate data from the DB server (6 total).
• Employee: Acquire Employee credentials, Establish session to the Web server, Acquire DC credentials, Establish session to the DC, Get root access to the DC, Get the golden ticket, Exfiltrate data from the DB server (7 total).

The total counts shown above are the minimal steps for each variant, and the attacker cannot take a shorter path due to the lack of access or authorization. It is not surprising that the CTO case presents a much shorter path than the other two in this idealized setting. In reality, however, getting the CTO credentials might be harder to achieve than getting credentials from the large population of employees, especially if the CTO is well versed in cybersecurity hygiene.

Note that the minimal step counts will be used to calculate the so-called normalized attack difficulty, i.e., the average number of attack actions between advancing to a next step in the attack graph. This will enable a comparison of efforts and thus difficulty for the attacker to achieve the ultimate goal in each of the CTO, VPN, and Employee cases, respectively, when the idealistic assumption is lifted.

B. Random attacker

The random attacker, as the name implies, selects random actions from the entirety of the action space until the goal is reached or the number of actions in a run exceeds a given threshold. The network being considered seems to be small scale, but there is a significant number of sessions and accesses for each attack step. While not resembling real-world attackers, the random attacker provides an important benchmark to evaluate:
• correctness of the system behavior through fuzzing of the simulation environment.
• complexity and effect of different scenarios.
• efficiency of different attack strategies.

Fig. 4 illustrates how the random attacker is used to evaluate the complexity of the action space and the impact of different strategies applied to the underlying CTO scenario variant. It shows the number of actions needed to reach the goal over 100 runs. Note that the CTO scenario requires a sequence of only two correct actions to reach the goal. Yet it can take

Fig. 3. Scenario attack graph (nodes shown: Spearphishing campaign; PC:Employee session established; Employee credentials extracted; SRV:Web session established; Partner session established; SRV:API session established; SRV:DC credentials extracted; SRV:DC session established; SRV:DC Administrator privileges obtained; Golden ticket generated; PC:CTO session established; CTO credentials extracted; SRV:DB Data extracted)
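The minimal step counts of the three variants (2, 6, and 7) can be reproduced with a breadth-first search over the attack graph. The sketch below is a minimal illustration: the node names are taken from Fig. 3, but the edge set is an assumption reconstructed from the abstract action lists of the scripted attacker, and `ATTACK_GRAPH` and `minimal_steps` are hypothetical names.

```python
from collections import deque

# Directed edges between attack-graph states. The edge set is an assumption
# inferred from the node labels of Fig. 3 and the scripted attacker's actions.
ATTACK_GRAPH = {
    "PC:Employee session established": ["Employee credentials extracted"],
    "Employee credentials extracted": ["SRV:Web session established"],
    "SRV:Web session established": ["SRV:DC credentials extracted"],
    "Partner session established": ["SRV:API session established"],
    "SRV:API session established": ["SRV:DC credentials extracted"],
    "SRV:DC credentials extracted": ["SRV:DC session established"],
    "SRV:DC session established": ["SRV:DC Administrator privileges obtained"],
    "SRV:DC Administrator privileges obtained": ["Golden ticket generated"],
    "Golden ticket generated": ["SRV:DB Data extracted"],
    "PC:CTO session established": ["CTO credentials extracted"],
    "CTO credentials extracted": ["SRV:DB Data extracted"],
}
GOAL = "SRV:DB Data extracted"


def minimal_steps(graph, start, goal):
    """Breadth-first search; returns the number of edges on a shortest path."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal not reachable from start
```

Each variant starts from the session opened by the spearphishing attempt, so the start node is the corresponding "session established" state. Reconnaissance steps are not modelled here, so the counts correspond to the numbers outside the parentheses.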


a significant number of actions to reach the goal due to the large action space the CTO has access to. We consider the following random attacker strategies (rules followed by the random attacker) to reduce the action space:
• Known services: the attacker targeted only services it knew were running on particular hosts.
• Live machines: the attacker did not retry a combination of session and target after it received a network failure.

Fig. 4. Number of actions needed to reach the goal when simulating different random CTO attacker strategies and their combinations. Each box-plot represents 100 runs with up to 150,000 actions. (strategies shown: No reduction strategy; Known services; Live machines; Live machines + Known services)

Fig. 4 provides insights into the impact of particular strategies as well as into the usage of randomized attackers to test simulated cyberattack scenarios. To begin with, having a random attacker utilize the entire action space without any effort to reduce it is pointless. The current scenario variant had at least 1.5 million possible actions, which could double with each successful session-establishing action not leading to the goal. Therefore, it is not surprising that the majority of runs ended by reaching the threshold and failing the goal. Focusing on known services does have an impact, which is proportional to the average number of services on each host and the totality of services in the scenario. Focusing only on live hosts has the biggest impact and is the de facto prerequisite to running more complex scenarios. Finally, compounding the two strategies reduces the action space even further.

The general insight is that randomized attackers are a viable concept for evaluation and benchmarking simulations, but they require aggressive tactics to reduce the possible action space, especially for more complex scenarios with a lot of different hosts, services, exploits, and authorization mechanisms. For the benchmark in the next section, we opted for the live-hosts strategy as it ensures the steepest reduction in activity with minimal interference with the random selection process.

C. Learning attacker

The learning attacker treats the simulated scenarios as a multi-armed bandit problem and employs the UCB1 [43] algorithm to reach the scenario goal. The attacker keeps track of the uses of targets, actions, sessions, etc. By assigning specific reward values for successful and unsuccessful activities, it is able to compute the upper confidence bound and choose the next action accordingly. To prevent nonsensical ordering of actions, such as data deletion before extraction, it uses fixed action priorities. Essentially, the learning attacker acts more intelligently by selecting the viable targets and sessions. This serves as a step closer to mimicking real-world attacks and is used for comparison with the random attacker on the actions needed to reach the goal when interacting with the network components.

We formulate the following two hypotheses:
• On average, the learning attacker will require considerably fewer actions to finish the scenario than a random attacker employing the live machines strategy.
• The normalized attack difficulty (NAD)¹ of the learning attacker will decrease over time, whereas for the random attacker it will remain constant.

The first hypothesis is based on the learning attacker’s ability to gradually add possible targets, rather than removing them from the entire target space as the random attacker employing the live machines strategy does. The second hypothesis is based on the random attacker not understanding any relations between actions and their consequences and selecting actions by chance, whereas the learning attacker learns the appropriate actions over time.

Figures 5 and 6 show the raw and normalized number of actions required to finish each of the scenario variants for the random and the learning attackers. For both attackers, each scenario variant was run 1000 times. It is apparent that the first hypothesis holds, and even a cursory glance at the graphs shows that the learning attacker’s strategy is between one and two orders of magnitude more efficient.

Fig. 5. Number of actions for random attacker to finish under each scenario. (box-plots shown: CTO Random; CTO Random Normalized; VPN Random; VPN Random Normalized; Employee Random; Employee Random Normalized)

The second hypothesis also holds, but gives additional insights based on two observations from the plots. The first can be seen in Fig. 5, where the median NAD for the CTO variant is approximately 1/2 of the other two variants. The reason is that those two variants require the attacker to acquire

¹ NAD is the simulated attack actions count normalized by the minimal attack steps under each of the CTO, VPN, and Employee variant.
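The selection rule of the learning attacker and the NAD from the footnote can be sketched together. The following is a minimal, generic UCB1 sketch under stated assumptions: the class `UCB1`, the function `normalized_attack_difficulty`, the arm names, and the 0/1 reward values are hypothetical, and the fixed action priorities and the bookkeeping over targets and sessions described in the text are not modelled.

```python
import math


class UCB1:
    """Textbook UCB1 over a fixed set of arms (here: candidate attack actions)."""

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}    # times each arm was played
        self.totals = {a: 0.0 for a in self.arms}  # summed reward per arm
        self.t = 0                                 # total number of plays

    def select(self):
        # Play every arm once before using the confidence bound.
        for arm in self.arms:
            if self.counts[arm] == 0:
                return arm

        def upper_bound(arm):
            mean = self.totals[arm] / self.counts[arm]
            bonus = math.sqrt(2.0 * math.log(self.t) / self.counts[arm])
            return mean + bonus

        return max(self.arms, key=upper_bound)

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.totals[arm] += reward
        self.t += 1


def normalized_attack_difficulty(actions_taken, minimal_steps):
    """NAD: simulated action count divided by the variant's minimal steps."""
    return actions_taken / minimal_steps
```

As a usage example, an attacker that receives reward 1.0 for a successful action and 0.0 otherwise concentrates its plays on the successful arm while still occasionally retrying the others. With the minimal step counts of 2, 6, and 7 from Section V-A, a hypothetical run that needs 14 actions in the Employee variant has a NAD of 14 / 7 = 2.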


a new session via exploitation and this new session doubles the attacker’s action space. The second observation can be seen in Fig. 6. Even though the median NAD decreases over time as expected, the CTO variant displays unexpectedly large variation, and in many cases the easier scenario took longer to finish than the other more complex variants. This counter-intuitive behavior is rooted in the fact that the attacker under the CTO variant has visibility of the entirety of the infrastructure and is free to explore unfruitful branches of the attack graph. This has led to more possibilities and thus large variations, where the CTO variant can have very small or very high NADs compared to the other two variants, which are constrained in what the attackers have access to.

Fig. 6. Number of actions for learning attacker to finish under each scenario. (box-plots shown: CTO Learning; CTO Learning Normalized; VPN Learning; VPN Learning Normalized; Employee Learning; Employee Learning Normalized)

VI. SUMMARY AND FUTURE WORK

This paper introduced a new event driven simulation tailored to analyzing and evaluating adversarial behavior. The model fills a gap between different cybersecurity simulation works and tools by focusing on attackers’ intent and actions, by enabling integration of different attack models, and by operating on a session level. The model and its implementation were evaluated with three variants of Bronze Butler APT (BB). These variants of attacking agents possess the BB’s capabilities and were launched against a simulated corporate infrastructure with insecure configuration. We simulated random and learning attackers for each of the three variants, and assessed the efforts needed in each case to complete the attack goal. Our results showed not only insights on how to realize session level cyber adversary simulation, but also how different levels of access (CTO, VPN, and Employee) can lead to orders of magnitude differences in the number of actions needed to retrieve critical data.

The presented simulation engine is the first stage in a coordinated effort to create an infrastructure for development of autonomous cybersecurity agents, spearheaded by the NATO IST-152 research group [44], and as such will be expanded in several key areas:
• Action space: implementing the rest of the AIF which can be expressed within the simulation model, adding support for volumetric attacks, such as DDoS, adding a framework for inter-attacker communication and especially support for stealthy and distributed operations, and automating integration with exploit databases, such as NVD.
• Actor autonomy: adding support for automated generation of realistic cybersecurity scenarios. Currently, the approach of using the presented simulation model as a basis for expressing the scenarios as a satisfiability problem is being explored and results should be available soon.
• Defender support: adding support for methods of active defense, such as firewall manipulation, decoy services, etc., to transition from passive to active defense and to enable a feedback loop between attackers and defenders.
• Deployability: enabling the transition from simulation environment to emulated and real-world environments, such as KYPO [45], while maintaining 1:1 mapping in attacker capabilities by using the agent algorithms to drive real-world attacking platforms, such as Cryton [46].
• User experience: creation of an IDE to facilitate easier creation and analysis of cybersecurity scenarios.

REFERENCES

[1] R. Langner, “Stuxnet: Dissecting a cyberwarfare weapon,” IEEE Security & Privacy, vol. 9, no. 3, pp. 49–51, 2011.
[2] F. Erlacher and F. Dressler, “How to test an ids?: Genesids: An automated system for generating attack traffic,” in Proceedings of the 2018 Workshop on Traffic Measurements for Cybersecurity. ACM, 2018, pp. 46–51.
[3] M. H. Bhuyan, D. K. Bhattacharyya, and J. K. Kalita, “Towards generating real-life datasets for network intrusion detection,” IJ Network Security, vol. 17, no. 6, pp. 683–701, 2015.
[4] I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” in Proceedings of the International Conference on Information Systems Security and Privacy, 2018, pp. 108–116.
[5] S. Alzahrani and L. Hong, “Generation of ddos attack dataset for effective ids development and evaluation,” Journal of Information Security, vol. 9, no. 04, p. 225, 2018.
[6] M. Cermak, T. Jirsik, P. Velan, J. Komarkova, S. Spacek, M. Drasar, and T. Plesnik, “Towards provable network traffic measurement and analysis via semi-labeled trace datasets,” in Proceedings of 2018 Network Traffic Measurement and Analysis Conference (TMA). IEEE, 2018, pp. 1–8.
[7] C. Phillips and L. P. Swiler, “A graph-based system for network-vulnerability analysis,” in Proceedings of the 1998 workshop on New security paradigms, 1998, pp. 71–79.
[8] K. Kaynar, “A taxonomy for attack graph generation and usage in network security,” Journal of Information Security and Applications, vol. 29, pp. 27–56, 2016.
[9] X. Ou and A. Singhal, “Attack graph techniques,” in Quantitative Security Risk Assessment of Enterprise Networks. Springer, 2012, pp. 5–8.
[10] Z. Ye, Y. Guo, C. Wang, and A. Ju, “Survey on application of attack graph technology,” Journal of Communications, vol. 38, no. 11, pp. 121–132, 2017.
[11] V. Shandilya, C. B. Simmons, and S. Shiva, “Use of attack graphs in security systems,” Journal of Computer Networks and Communications, vol. 2014, 2014.
[12] J. Zeng, S. Wu, Y. Chen, R. Zeng, and C. Wu, “Survey of attack graph analysis methods from the perspective of data and knowledge processing,” Security and Communication Networks, vol. 2019, pp. 1–16, 12 2019.
[13] S. Yi, Y. Peng, Q. Xiong, T. Wang, Z. Dai, H. Gao, J. Xu, J. Wang, and L. Xu, “Overview on attack graph generation and visualization technology,” in 2013 International Conference on Anti-Counterfeiting, Security and Identification (ASID), 2013, pp. 1–6.


[14] C. Xiaolin, T. Xiaobin, Z. Yong, and X. Hongsheng, “A markov game theory-based risk assessment model for network information system,” in Proceedings of International Conference on Computer Science and Software Engineering, vol. 3. IEEE, 2008, pp. 1057–1061.
[15] K. C. Nguyen, T. Alpcan, and T. Basar, “Security games with incomplete information,” in Proceedings of IEEE International Conference on Communications. IEEE, 2009, pp. 1–6.
[16] B. Wang, J. Cai, S. Zhang, and J. Li, “A network security assessment model based on attack-defense game theory,” in Proceedings of 2010 International Conference on Computer Application and System Modeling (ICCASM), vol. 3. IEEE, 2010, pp. V3–639.
[17] K. Chung, C. A. Kamhoua, K. A. Kwiat, Z. T. Kalbarczyk, and R. K. Iyer, “Game theory with learning for cyber security monitoring,” in Proceedings of 2016 IEEE 17th International Symposium on High Assurance Systems Engineering (HASE). IEEE, 2016, pp. 1–8.
[18] K. Sallhammar, B. E. Helvik, and S. J. Knapskog, “Towards a stochastic model for integrated security and dependability evaluation,” in Proceedings of First International Conference on Availability, Reliability and Security. IEEE, 2006, pp. 8–pp.
[19] H. Wang, Y. Liang, and X. Liu, “Stochastic game theoretic method of quantification for network situational awareness,” in Proceedings of 2008 International Conference on Internet Computing in Science and Engineering. IEEE, 2008, pp. 312–316.
[20] K. Sallhammar, B. E. Helvik, and S. J. Knapskog, “A framework for predicting security and dependability measures in real-time,” International Journal of Computer Science and Network Security, vol. 7, no. 3, pp. 169–183, 2007.
[21] X. Liang and Y. Xiao, “Game theory for network security,” IEEE Communications Surveys & Tutorials, vol. 15, no. 1, pp. 472–486, 2013.
[22] M. Fallah, “A puzzle-based defense strategy against flooding attacks using game theory,” IEEE transactions on Dependable and Secure Computing, vol. 7, no. 1, pp. 5–19, 2010.
[23] S. Musman and A. Turner, “A game theoretic approach to cyber security risk management,” The Journal of Defense Modeling and Simulation, vol. 15, no. 2, pp. 127–146, 2018.
[24] D. Grunewald, M. Lützenberger, J. Chinnow, R. Bye, K. Bsufka, and S. Albayrak, “Agent-based network security simulation,” in The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3, ser. AAMAS ’11. Richland, SC: International Foundation for Autonomous Agents and Multiagent Systems, 2011, p. 1325–1326.
[25] J. Chinnow, J. Tonn, K. Bsufka, T. Konnerth, and S. Albayrak, “A tool set for the evaluation of security and reliability in smart grids,” in Smart Grid Security, 01 2013, pp. 45–57.
[26] S. Moskal, B. Wheeler, D. Kreider, M. E. Kuhl, and S. J. Yang, “Context model fusion for multistage network attack simulation,” in Proceedings of Military Communications Conference (MILCOM), 2014 IEEE. IEEE, 2014, pp. 158–163.
[27] S. Moskal, S. J. Yang, and M. E. Kuhl, “Cyber threat assessment via attack scenario simulation using an integrated adversary and network modeling approach,” The Journal of Defense Modeling and Simulation, vol. 15, no. 1, pp. 13–29, 2018.
[28] S.-D. Chi, J. S. Park, K.-C. Jung, and J.-S. Lee, “Network security modeling and cyber attack simulation methodology,” in Proceedings of the 6th Australasian Conference on Information Security and Privacy, ser. ACISP ’01. Berlin, Heidelberg: Springer-Verlag, 2001, p. 320–333.
[29] M. Liljenstam, J. Liu, D. Nicol, Y. Yuan, G. Yan, and C. Grier, “Rinse: the real-time immersive network simulation environment for network security exercises,” in Workshop on Principles of Advanced and Distributed Simulation (PADS’05), 2005, pp. 119–128.
[30] A. Futoransky, F. Miranda, J. Orlicki, and C. Sarraute, “Simulating cyber-attacks for fun and profit,” in Proceedings of the 2nd International Conference on Simulation Tools and Techniques. ICST, 5 2010.
[31] M. E. Kuhl, M. Sudit, J. Kistner, and K. Costantini, “Cyber attack modeling and simulation for network security analysis,” in 2007 Winter Simulation Conference, 2007, pp. 1180–1188.
[32] G. Rush, D. R. Tauritz, and A. D. Kent, “Dcafe: A distributed cyber security automation framework for experiments,” 2014 IEEE 38th International Computer Software and Applications Conference Workshops, pp. 134–139, 2014.
[33] H. Holm and T. Sommestad, “Sved: Scanning, vulnerabilities, exploits and detection,” in MILCOM 2016 - 2016 IEEE Military Communications Conference, 2016, pp. 976–981.
[34] A. Varga, OMNeT++. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp. 35–59. [Online]. Available: https://doi.org/10.1007/978-3-642-12331-3_3
[35] National Institute of Standards and Technology. NVD - General FAQs. [Online]. Available: https://nvd.nist.gov/general/FAQ-Sections/General-FAQs
[36] The MITRE Corporation. About CVE. [Online]. Available: https://cve.mitre.org/about/index.html
[37] S. Moskal and S. J. Yang, “Cyberattack action-intent-framework for mapping intrusion observables,” 2020.
[38] The MITRE Corporation. MITRE ATT&CK. [Online]. Available: https://attack.mitre.org/
[39] B. E. Strom, J. A. Battaglia, M. S. Kemmerer, W. Kupersanin, D. P. Miller, C. Wampler, S. M. Whitley, and R. D. Wolf. (2017) Finding cyber threats with ATT&CK-based analytics.
[40] J. DiMaggio, “Tick cyberespionage group zeros in on japan,” https://muni.cz/go/de26e1, 2016, [Online; accessed 13-May-2020].
[41] Counter Threat Research Team, “Bronze butler targets japanese enterprises,” https://www.secureworks.com/research/bronze-butler-targets-japanese-businesses, 2017, [Online; accessed 13-May-2020].
[42] MITRE ATT&CK Team, “Bronze butler,” https://attack.mitre.org/groups/G0060/, 2019, [Online; accessed 13-May-2020].
[43] M. M. Drugan and A. Nowe, “Designing multi-objective multi-armed bandits algorithms: A study,” in The 2013 International Joint Conference on Neural Networks (IJCNN), 2013, pp. 1–8.
[44] P. Theron, A. Kott, M. Drašar, K. Rzadca, B. LeBlanc, M. Pihelgas, L. Mancini, and F. de Gaspari, Reference Architecture of an Autonomous Agent for Cyber Defense of Complex Military Systems. Cham: Springer International Publishing, 2020, pp. 1–21. [Online]. Available: https://doi.org/10.1007/978-3-030-33432-1_1
[45] P. Čeleda, J. Vykopal, V. Švábenský, and K. Slavíček, “Kypo4industry: A testbed for teaching cybersecurity of industrial control systems,” in Proceedings of the 51st ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’20. New York, NY, USA: Association for Computing Machinery, 2020, p. 1026–1032. [Online]. Available: https://doi.org/10.1145/3328778.3366908
[46] I. NUTÁR, “Automation of complex attack scenarios [online],” Master’s thesis, Masaryk University, Faculty of Informatics, Brno, 2017. [Online]. Available: https://is.muni.cz/th/cry3j/

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

A New Julia-Based Parallel Time-Domain Simulation Algorithm for Analysis of Power System Dynamics
Michael Kyesswa, Philipp Schmurr, Hüseyin K. Çakmak, Uwe Kühnapfel, Veit Hagenmeyer
Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
{michael.kyesswa, hueseyin.cakmak, uwe.kuehnapfel, veit.hagenmeyer}@kit.edu, philipp.schmurr@gmail.com

Abstract: The present paper describes a new parallel time-domain simulation algorithm using a high performance computing environment – Julia – for the analysis of power system dynamics in large networks. The parallel algorithm adapts a parallel-in-space decomposition scheme to a previously sequential algorithm in order to develop a new parallelizable numerical solution of the power system equations. The parallel-in-space decomposition is based on the block bordered diagonal form, which reformulates the network admittance matrix into sub-blocks that can be solved in parallel. For the optimal spatial decomposition of the network, a new extended graph partitioning strategy is developed for load balancing and minimizing the communication between subnetworks. The new parallel simulation algorithm is tested using standard test networks of varying complexity. The simulation results are compared to those obtained from a sequential implementation in order to validate the solution accuracy and to determine the performance improvement in terms of computational speedup. Test simulations are conducted using the ForHLR II supercomputing cluster and show a huge potential in computational speedup with increasing network complexity.

Index Terms: Graph partitioning, parallel computing, power systems, time-domain simulation, transient stability analysis

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

The power system sector has seen an increase in the integration of renewable and distributed generation as a contribution to the inter-sectoral effort to address the climate change challenges. Due to the variable nature of renewable energy sources, flexibilities such as demand side management and storage devices are integrated in the power system towards a successful energy transition. In addition, the complexity of the power system is further increasing in light of the current operation of large interconnected networks and an increase in electricity demand from e.g. electric vehicles and heat pumps. From the power system analysis perspective, the impact of these changes in operation conditions is an increase in computational complexity in the simulation tools applied for stability and control studies. This implies that the traditional simulation tools require significant improvements through the use of more efficient state-of-the-art computing environments and application of high performance computing hardware to cope with the increasing complexity in the power system analysis problem. This has led to an increasing interest in the implementation of parallel and distributed algorithms for power system analysis [1]. The need for parallel solutions is further supported by the current advances in computational technology shifting from running applications on single computers with a single processor to a distributed and parallel computing architecture.

A number of methods and applications of high performance computing in power system studies have been reported in literature [2]. Specific to dynamic simulations, parallel computing techniques are developed by decomposing the system and models into subsystems that can be parallelized, and devising alternative algorithms which offer more parallelization potential. The algorithms for decomposing the system into parallelizable operations can be broadly divided into parallel-in-space, parallel-in-time, and waveform relaxation [3]. In the parallel-in-space algorithms, the network is partitioned into independent subnetworks and the subnetwork equations are assigned to different processors. This method was applied in [4] using a Block Bordered Diagonal Form (BBDF) and in [5] using a Multi-Area Thévenin Equivalent (MATE) algorithm. The parallel-in-time algorithms consider the combination of the differential and algebraic equations over several time steps to create a larger system, which can then be solved simultaneously as described in [6]. The waveform relaxation method was introduced in [7] for transient stability analysis and implemented on a parallel computer in [8]. This method separates the system of differential algebraic equations into subsystems and distributes them to different processors to be solved simultaneously.

The above approaches address the fundamental aspects required for the parallelization of the power system problem using decomposition schemes and numerical solutions based on implicit integration methods for the discretization of the differential equations. However, these approaches have not been applied to adapt the optimized and efficient sequential algorithms based on explicit integration methods to parallel computations using state-of-the-art high performance computing environments. Moreover, the iterative solution in implicit methods implies a larger amount of computations at each time step than in the solution using explicit methods, which influences the numerical efficiency. Furthermore, since the fundamental parallel approaches rely on system decomposition, it is of interest to apply optimal network decomposition in the formulation of the parallel solutions in order to balance the processor loads and minimize communication between subnetworks. An optimized network partitioning results in a small interconnecting system between the subnetworks. This necessitates new efficient solution techniques for the network algebraic equations, since the conjugate gradient method in the parallel-in-space approach presented in [4] is considered to be inefficient for small optimized dimensions of the interconnect partition.

In the present paper, a parallel-in-space decomposition scheme is used to adapt a fundamentally sequential algorithm to a parallel simulation algorithm using a high performance computing environment – Julia [9]. The proposed method adapts parallelism to the sequential program on the algorithmic level for the solution of the network algebraic equation. This is due to the fact that analysis of the sequential time-domain numerical solution shows that solving the network equation consumes a huge amount of time [5]. The proposed parallel method therefore restructures the sequential solution of the network algebraic equation in such a way that allows parallelization of the network solution. This restructuring is based on the block bordered diagonal form for reformulation of the network coefficient matrix.

The network spatial decomposition applied in the present paper, however, requires a grid partitioning strategy. For this, a new extended optimal graph partitioning approach is proposed to obtain balanced subnetworks which can be solved in parallel and only linked via an interconnect partition to share information at every time step. In such a decomposition, the set of differential equations is solved in parallel due to the natural decoupling of the machines. The proposed numerical solution of the algebraic equations in the presented algorithm is based on an efficient implementation of the direct LU factorization method [10]. Combined with the optimization of the network partitions, the approach in the present paper provides a computational advantage for the solution of the algebraic equations.

The rest of the paper is organized as follows: Section II gives a detailed description of the applied methodology for the parallel formulation of the dynamic simulation solution and the applied network partitioning strategy. The implementation of the extended network partitioning method and the new proposed parallel dynamic simulation are described in Section III. Section IV presents results to validate the accuracy and evaluate the performance of the proposed parallel method. Finally, Section V concludes the paper, including an outlook for the future work.

II. METHODOLOGY

The present section gives an overview of the general representation of the power system for transient stability analysis considered in this paper and then describes the formulation of the parallelizable solution of the network equation by restructuring of the network equation coefficient matrix and applying graph partitioning strategy.

A. General Power System Representation

The general model of the power system for transient stability analysis is described by a set of differential and algebraic equations of the form

$$\dot{x} = f(x, y, u) \tag{1}$$

$$0 = g(x, y, u) \tag{2}$$

where $x$ is a vector of dynamic state variables, $y$ is a vector of algebraic variables, and $u$ are system parameters. The set of differential equations represents uncoupled subsets used to model the dynamic behavior of synchronous generators and the connected controllers. The generator controllers include the excitation control system for regulating the generator terminal voltage [11] and the turbine-governor system for controlling the rotational speed and input mechanical power [12]. The generator subsystems are coupled to each other through the transmission network. Other components that are represented by differential equations include Static Var Compensators (SVC), dynamic loads and HVDC devices. The set of algebraic equations comprises the stator equations of each generator, coupled to the equations of the transmission network and static loads. The interface between the generators and network system is through the stator algebraic equations included in the network nodal equation given by

$$Y V = I \tag{3}$$

where $Y$ is the network nodal admittance matrix of order $n \times n$, for $n$ nodes in the system, $V$ and $I$ are vectors of node voltages and current injections of order $n$, respectively.

In the present paper, the system of equations in (1) and (2) is solved based on the alternating solution scheme in which the differential equations and algebraic equations are solved separately at every integration step [13]. The set of discretized equations in (1) is solved for $x_{n+1}$, which is then substituted into equation (2) to solve for $y_{n+1}$. The discretization method applied in this case is an explicit integration scheme using the Runge-Kutta fourth order method [14]. The following steps summarize the sequential solution of the transient stability analysis problem:

(a) Determine initial steady state operating conditions at $t = 0$: $V_0$
(b) Initialize dynamic state variables $x_0$ and compute the initial algebraic variables
(c) At time $t + 1$, calculate the dynamic state variables $x_{t+1}$ from the discretized form of (1) using the known values at time $t$.
(d) Compute the algebraic variables $y_{t+1}$ from $0 = g(x_{t+1}, y_{t+1})$ using the known values of $x_{t+1}$.
(e) Repeat steps (c) and (d) at every further time step.

Among the solved algebraic equations in step (d) is the network equation in (3) for the unknown node voltages. The node current injections $I$ from the system machines are


calculated from the dynamic state variables. The solution of the network equation is based on LU factorization [14] of the admittance matrix $Y$ and solving for $V$ from a linear set of equations using the defined node current injections. Further details about the models and the sequential solution process are given in [15], [16]. Analysis of the computation time of the individual stages of the sequential dynamic simulation algorithm shows that the solution of (3) takes up a large percentage of the total runtime in a single simulation, as also stated in [5]. It is therefore of interest to reformulate the equation into a form where effective parallelization of the solution approach can be applied as explained in the following section.

B. Formulation of Parallel Solution

In the present paper, a parallel-in-space approach is applied to reformulate the network equation into an alternative form that can be easily parallelized during the network solution step. The parallel form of the problem is formulated based on the block bordered diagonal form as described in [17]. This spatial network decomposition forms the basis of the parallel time-domain simulation introduced in the present paper. The BBDF formulation restructures the network in such a way that the $n$ network nodes are grouped into $p+1$ sub-blocks; where $p$ is the number of subnetworks and the $(p+1)$th sub-block represents the boundary nodes interconnecting the different subnetworks. The subnetworks are created from partitioning the grid.

The result of the BBDF formulation is shown in the restructured network admittance matrix and the network nodal equation

$$
\begin{bmatrix}
Y_1 & & & & \bar{Y}_1 \\
 & Y_2 & & & \bar{Y}_2 \\
 & & \ddots & & \vdots \\
 & & & Y_p & \bar{Y}_p \\
\bar{Y}_1^T & \bar{Y}_2^T & \cdots & \bar{Y}_p^T & Y_s
\end{bmatrix}
\cdot
\begin{bmatrix} V_1 \\ V_2 \\ \vdots \\ V_p \\ V_s \end{bmatrix}
=
\begin{bmatrix} I_1 \\ I_2 \\ \vdots \\ I_p \\ I_s \end{bmatrix}
\tag{4}
$$

where $Y_i$ are the elements of the original $Y$ matrix within subnetwork $i$; $Y_s$ is the nodal admittance matrix formed by the boundary nodes in the $(p+1)$th sub-block. The $\bar{Y}_i$ elements consist of data regarding the branches that connect subnetwork $i$ to the $(p+1)$th sub-block.

The new formulation of the network nodal equation in (4) rearranges the solution of the equation into two steps; the preprocessing step and one step for every subnetwork of the matrix given in (5) – (8).

$$\hat{Y}_s V_s = \hat{I}_s \tag{5}$$

$$\hat{Y}_s = Y_s - \sum_{i=1}^{p} \bar{Y}_i^T Y_i^{-1} \bar{Y}_i \tag{6}$$

$$\hat{I}_s = I_s - \sum_{i=1}^{p} \bar{Y}_i^T Y_i^{-1} I_i \tag{7}$$

$$Y_i V_i = I_i - \bar{Y}_i V_s \tag{8}$$

Equation (5) shows the preprocessing step where the matrix $\hat{Y}_s$ and vector $\hat{I}_s$ are calculated in (6) and (7), respectively. The preprocessing step is used to compute the boundary node voltages $V_s$, which are required for the second step to separately solve for the node voltages $V_i$ in the $i$th subnetwork as given in (8). The solution of the subnetwork node voltages in (8) can therefore be performed in parallel and constitutes the parallelizable task in the network solution implemented in the present paper.

Further speedup can be realized by optimizing specific computations in the preprocessing step. For instance, the solution of matrix $\hat{Y}_s$ in (6) can be completely computed at the beginning of the simulation. In (7), the vector $\hat{I}_s$ is calculated at every time step of the simulation from current vector $I_s$, which changes throughout the simulation. However, the matrix product $\bar{Y}_i^T \cdot Y_i^{-1}$ in (7) can be pre-computed as part of the simulation optimization strategy. The solutions of the main equations (5) and (8) are based on the LU factorization [14] of the respective admittance matrices. The matrices $\hat{Y}_s$ and $Y_i$ can be factorized by LU decomposition beforehand to speed up the calculation during the simulation loop.

From the above formulation, solving the block bordered diagonal form initially requires a sequential solution for a vector of boundary node voltages from (5), which grows with more than linear complexity. An important consideration in the formulation of the parallel solution is an efficient partitioning strategy of the network with a minimum number of boundary nodes in the $(p+1)$th sub-block (interconnect partition) for an optimal performance. The graph partitioning strategy extended for application to dynamic simulations is described in the following section.

C. Network Partitioning

Since power grids can be naturally represented as graphs, network partitioning can be formulated as a graph partitioning optimization problem. The main criteria for optimizing performance during the solution of (4) defined by the system of equations in (5) – (8) are a minimum number of branches that connect partitions and a balance in the sizes of the subnetworks or partitions. These criteria are similar to the main requirements in graph partitioning [18], where the objective is to minimize the number of cut edges. However, the power grids in the present paper are considered to be unweighted graphs with a weight function of one for every branch. Thereby, a minimum number of cut edges results in a minimal number of branches that interconnect to other partitions. This section describes the partitioning strategy applied for the dynamic simulations presented in this work: the basic partitioning format using graph partitioning and the extension to the interconnect partition format.

1) Basic Graph Partitioning Format: A multilevel graph partitioning approach, known as the Karlsruhe Fast Flow Partitioner (KaFFPa) algorithm [19], is used in the present paper to generate equally sized partitions that have a minimal number of cut branches. This partitioning output of the algorithm is referred to as the "basic partitioning format" in this


paper. A multilevel graph partitioning process is defined by the following three steps: coarsening, initial partitioning, and refinement [20], [21]. In the coarsening step, the algorithm contracts the input graph to create a smaller representation of the graph. The contraction is based on a matching strategy, which identifies a set of edges that do not have a common end point (vertex) [22]. A matching is then contracted by combining the start and endpoint of every edge in the set; thus decreasing the size of the input graph. As soon as the graph is small enough, the initial partitioning step is applied using a global partitioning algorithm. The KaFFPa algorithm uses the SCOTCH global partitioning algorithm [23] for the initial partitioning step. The refinement step involves iteratively uncontracting the matchings, while applying local refinement strategies to improve the number of cut branches and balance the partitions.

[Figure: left panel "Basic partition format" with Subsystems 1 to 3; right panel "Interconnect partition format", in which each Subsystem $i$ holds $Y_i$, $\bar{Y}_i$ and solves Eqn (8), the Interconnect holds $Y_s$ and solves Eqn (5) – (7), and $I_i$, $Y_i$, $\bar{Y}_i$ and $V_s$ are exchanged between them]
Fig. 1. Formation of interconnect partition format for a three-subsystem network
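The coarsening step described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not KaFFPa's implementation: the greedy matching below is a simple stand-in for the partitioner's matching strategy, edge weights are ignored because the grids are treated as unweighted graphs, and the function names `greedy_matching` and `contract` as well as the 6-bus ring are hypothetical.

```python
def greedy_matching(edges):
    """Pick edges greedily so that no two chosen edges share a vertex."""
    matched = set()
    matching = []
    for u, v in edges:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def contract(edges, matching):
    """Merge both endpoints of every matched edge into one coarse vertex.

    Returns the coarse edge list and the fine-to-coarse vertex map.
    """
    vertices = sorted({x for e in edges for x in e})
    coarse_of = {}
    next_id = 0
    for u, v in matching:
        coarse_of[u] = coarse_of[v] = next_id
        next_id += 1
    for x in vertices:  # unmatched vertices keep a coarse vertex of their own
        if x not in coarse_of:
            coarse_of[x] = next_id
            next_id += 1
    coarse_edges = set()
    for u, v in edges:
        cu, cv = coarse_of[u], coarse_of[v]
        if cu != cv:  # edges inside a contracted pair disappear
            coarse_edges.add((min(cu, cv), max(cu, cv)))
    return sorted(coarse_edges), coarse_of


# Hypothetical 6-bus ring 1-2-3-4-5-6-1 (illustrative data only)
ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
matching = greedy_matching(ring)
coarse_edges, coarse_of = contract(ring, matching)
```

One coarsening pass halves the ring to three coarse vertices; a multilevel partitioner repeats such passes until the graph is small enough for the initial partitioning, then undoes them one level at a time during refinement.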
A detailed description of the KaFFPa algorithm is given in [21] and [19]. The result is a partitioning with evenly sized partitions and a minimal number of edges that are cut in between the partitions. This is desirable for even distribution of work between parallel tasks and guarantees a minimal overhead for handling the relations between partitions. However, the basic partitions cannot be directly applied to the parallel formulation of the network equation, which requires an interconnecting partition between the main subnetworks. The basic partitioning is therefore modified into an interconnecting-partitioning format as described in the following.

2) Interconnect-Partitioning Format: The restructured admittance matrix in (4) consists of a set of nodes forming the interconnect partition. This set of nodes is formed such that the nodes in the main subnetworks are only connected to the nodes in the interconnect partition; thus no direct connection exists between the different subnetworks. This requires a special partition format to create an extra partition containing the interconnecting nodes under the condition that a node in the interconnect partition cannot be included in any of the main sub partitions. In the present paper, the proposed strategy to derive the required partitioning format is referred to as the "interconnect-partitioning format".

The inputs to the interconnect-partitioning algorithm are the basic network partitions from the graph partitioning algorithm. The procedure of the interconnect partition format is summarized as follows: Initially, the nodes are ranked according to the number of adjacent vertices connecting to a different partition. The node with the most of these edges is moved to the interconnect partition. The ranking of the nodes is then updated and the next node is picked. With this selection, a minimum size of the interconnect partition is formed for a given set of input partitions. In cases where multiple nodes exist with the same number of cut branches, the secondary selection criterion is the size of the main partition. The node from the largest partition is selected; thus maintaining a balance in terms of the partition sizes. The partitioning process is terminated when all the adjacent vertices to other partitions are eliminated and the connection between partitions exists only through the interconnect partition. The partitioning process for transforming the system from the initial basic partition format to the interconnect partition format is illustrated in Fig. 1, which shows the data streams for the interaction between partitions. The nodes in the interconnect partition serve as the link for exchanging information between the main partitions. The formulation in Fig. 1 shows that the equations in the subnetworks can be solved in parallel. The equations formed by the interconnect partition have to be solved sequentially at every time step. The sequential part of the solution depends on the size of the interconnect partition. The number of nodes in the interconnect partition is optimized by minimizing the number of cut branches between partitions from the basic partitioning format.

III. IMPLEMENTATION OF SOLUTION

The parallel time-domain simulation is implemented in the Julia programming environment [9], which offers good performance and great potential of parallelization. This section describes the implementation of the proposed network partitioning strategy and parallel solution of the transient stability problem.

A. Extended Partitioning Format

The inputs to the partitioning algorithm are the unpartitioned network files and the required number of graph partitions $p$. The network files provide the topological information required by the partitioning algorithm. In the presented implementation, the input networks are defined based on the Matpower casefile format [24]. The initial step of the algorithm is to convert the network topology into a graph format to be used by the KaFFPa graph partitioning program. The graph definition in the KaFFPa program is based on the Metis graph format [25]. Algorithm 1 summarizes the initial graph partitioning process. The text file from Metis graph format and the required partitions $p$ are the inputs to the KaFFPa partitioning program. The output of the program is the partitioned graph in text format with $n$ rows. The $n$th row corresponds to the $n$th vertex of the graph. Each row defines the partition (1 to $p$) in which the corresponding row vertex is located. This output defines the basic partition format. A Matlab algorithm


is implemented to convert the basic partition format into the interconnect partition format. The implementation of the converter algorithm is based on the procedure described in Section II-C and summarized in Algorithm 2.

Algorithm 2: Interconnect partitioning format
Inputs: b branches, partition indices (1 . . . p) for nodes
Assign interconnect partition index p + 1
for each branch do
    Determine partition indices of from and to nodes
    if from ≠ to & partition index ≠ p + 1 then
        branch → cut branch; nodes → boundary nodes
    end
    Ranking boundary nodes:
    for boundary nodes do
        Rank based on branch count to other partitions
        if nodes exist with equal branches then
            Rank based on partition size of node location
        end
        Move highest ranking node to index p + 1
        Update list of boundary nodes
        Repeat Until set of boundary nodes is empty
    end
end
Return: partitions 1 to p and interconnect partition p + 1

B. Parallel Dynamic Grid Simulation

The inputs to the parallel time-domain simulation algorithm include the network casefile in Matpower format and the corresponding subnetworks formed by the preprocessing partitioning procedure described in Section III-A. The initial step in any dynamic simulation algorithm is the power flow solution to establish a quasi-steady state starting point for the simulation. In the present implementation, the power flow calculation is based on the PowerModels package [26] in Julia. In terms of implementation, the advantage of applying the PowerModels package is that the network data format is consistent with the Matpower file format. With this property, the network casefiles defined in Matpower can be directly used in the Julia simulation. Additional inputs for the parallel dynamic simulation are the dynamic model parameter and network events files. These files are defined in Matlab and directly called within the Julia algorithm using the Julia package Matlab.jl.

The parallelization in the dynamic simulations is limited to a single time step, since the solutions are based on the step-by-step numerical solution. The first parallelization step is the computation of the decoupled machine differential equations using the fourth order Runge-Kutta method to obtain the node current injections. The second step is the computation of the BBDF formulated network equation. The solution of the network consists of the precomputation steps, which are mainly matrix construction steps required for the sequential solution of the interconnect partition equations, and the parallel solution of the subnetwork equations. For the task of solving the linear network system, the sparse LU factorization solver using the UMFPACK library [10] is applied. The main steps of the parallel algorithm are summarized in Algorithm 3.

C. Communication Aspects

Algorithm 3: Parallel computation procedure
Inputs: Network casefile and partitions
Initialization: $V_0$, $X_0$
Precomputation:
    Form subsystem matrices $Y_i$, $\bar{Y}_i$ and $Y_s$
    Compute $\hat{Y}_s$ and product $\bar{Y}_i^T Y_i^{-1}$
    LU factorize $Y_i$ and $\hat{Y}_s$
for each partition do
    Calculate machine state variables $X_i$
    Compute current injection in each partition $I_i$
end
Compute link currents $\hat{I}_s$
Solve for the interconnect subnetwork voltages $V_s$
for each partition do
    Solve for subnetwork node voltages $V_i$
end
Return: State and algebraic variables at each time step
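The network-solution part of Algorithm 3, that is equations (5) to (8), can be sketched as follows. This is an illustration under stated assumptions rather than the paper's Julia code. It is written in plain Python. A small dense Gaussian-elimination helper (`solve`) stands in for the sparse UMFPACK LU factorization. Nothing is prefactorized or cached between time steps. The names `solve`, `column` and `bbdf_solve` and the five-node data are hypothetical. The thread pool only mirrors the independence of the per-subnetwork solves.

```python
from concurrent.futures import ThreadPoolExecutor


def solve(A, b):
    """Solve the dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(A)
    M = [list(A[r]) + [b[r]] for r in range(n)]  # augmented copy
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def column(B, j):
    return [row[j] for row in B]


def bbdf_solve(blocks, Ys, Is):
    """blocks: list of (Yi, Ybar_i, Ii). Returns ([V1, ..., Vp], Vs)."""
    ns = len(Ys)
    Yhat = [list(row) for row in Ys]
    Ihat = list(Is)
    for Yi, Ybar, Ii in blocks:
        ni = len(Yi)
        Z = [solve(Yi, column(Ybar, j)) for j in range(ns)]  # Z[j] = column j of Yi^-1 Ybar_i
        w = solve(Yi, Ii)                                     # w = Yi^-1 Ii
        for a in range(ns):
            for j in range(ns):
                Yhat[a][j] -= sum(Ybar[k][a] * Z[j][k] for k in range(ni))  # eq. (6)
            Ihat[a] -= sum(Ybar[k][a] * w[k] for k in range(ni))            # eq. (7)
    Vs = solve(Yhat, Ihat)                                                   # eq. (5)

    def subnetwork(block):
        Yi, Ybar, Ii = block
        rhs = [Ii[k] - sum(Ybar[k][j] * Vs[j] for j in range(ns))
               for k in range(len(Yi))]
        return solve(Yi, rhs)                                                # eq. (8)

    # The subnetwork solves are independent once Vs is known.
    with ThreadPoolExecutor() as pool:
        V = list(pool.map(subnetwork, blocks))
    return V, Vs


# Illustrative data: two 2-node subnetworks joined by one boundary node.
blocks = [
    ([[4.0, -1.0], [-1.0, 3.0]], [[-1.0], [0.0]], [1.0, 0.0]),
    ([[5.0, -2.0], [-2.0, 4.0]], [[0.0], [-1.0]], [0.0, 2.0]),
]
Ys = [[3.0]]
Is = [0.5]
V, Vs = bbdf_solve(blocks, Ys, Is)
```

Only the solve for `Vs` is sequential, and its size equals the number of boundary nodes, which is why a small interconnect partition matters for the achievable speedup.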
The parallel dynamic simulation algorithm proposed in the present paper is based on a single node parallelization. The main computations in the algorithm are memory bound tasks dealing with vector arithmetic, matrix multiplication, and solving of the network equation. In the Julia environment, such a parallelization problem can be effectively handled using the multithreading construct [9].

Algorithm 1: Initial graph partitioning
Inputs: Network casefile; n nodes, b branches
        Required partitions p
Eliminate recycling branches
Build graph G in text format from network topology
Partition G into p subsystems
Return: Graph G and p sub systems

Fig. 2 illustrates the simulation time line for the implemented parallel dynamic simulation for an example with two partitions ($p_1$, $p_2$) and the interconnect partition ($p + 1$). At the initialization step, the unpartitioned network is used to establish the quasi steady state conditions of the system. The steady state dynamic and algebraic variables of each subnetwork are derived from the unpartitioned network conditions. The subnetworks precompute their corresponding internal admittance matrices and the boundary matrix elements. The precomputation also includes the LU factorization of the internal admittance matrices required for the solution of (8). After all the subnetworks have finalized the matrix precomputation step, the interconnect partition starts receiving the matrices of the individual subnetworks to precompute the interconnect


admittance matrix according to (6).

At time $t_0$, the subnetworks start solving their corresponding differential equations and determine the node current injections. At $t_1$, the interconnect partition starts receiving the current injections from the subnetworks. At $t_2$, the link currents are computed according to (7) and the boundary node voltages are calculated from (5). During the time period $t_2 - t_3$, the subpartitions are in waiting mode for the boundary voltage inputs from the interconnect partition. At $t_3$, the interconnect partition starts sending the boundary node voltages to the subnetworks. This is followed by the solution of (8) for the node voltages at $t_4$. Once the subnetworks have updated their node voltages at $t_5$, the local dynamic state variables and algebraic variables are saved by the main processor at $t_6$.

[Figure: time lines for Partition 1, Interconnect and Partition 2 along the "Simulation time" axis, from the initialization step $t_{init}$ through $t_0$ to $t_6$, annotated with the exchanged quantities ($Y_i$, $\bar{Y}_i$, $x_i^0$, $y_i^0$; $Y_s$, $\hat{Y}_s$; $x_i$, $I_i$; $V_s$, $\hat{I}_s$; $V_i$, $y_i$; $x_t$, $y_t$)]
Fig. 2. Representation of communication aspects for a two-partition system showing the initialization step ($t_{init}$) and one simulation time step ($t_0 - t_6$)

The total simulation time for one time step is the sum of time taken for the computations from $t_0$ to $t_6$. This is true for a simulation with a fixed network topology. However, a change in network operating conditions, resulting from load changes, network faults, loss of generator or transmission line, causes a change in topology. This therefore necessitates reformulation of the corresponding admittance matrices in the subnetworks, in a process referred to as event handling in this paper. In a time step involving event handling, the simulation time includes the computation time for reformulating and factorizing the admittance matrices, and the solution of the algebraic equations (5) – (8) to update all algebraic variables before the next time step. This is carried out at every event of change in network topology.

IV. VALIDATION AND PERFORMANCE EVALUATION

The current section presents application cases to evaluate the accuracy and performance of the proposed simulation. In the first case, the accuracy of the parallel dynamic simulation

structures with varying partitions. The aim is to examine the speedup in relation to the network size and partitioning.

A. Simulation Setup

The simulations presented in this section have been obtained using the high performance computing cluster ForHLR II [28] at Karlsruhe Institute of Technology (KIT). The system consists of multiple nodes interconnected by an InfiniBand 4X EDR Interconnect to provide high link speed. Each of the nodes consists of two Intel Xeon E5-2660 v3 Deca-Core processors and 64 GByte of RAM. The dual socket setup delivers 20 usable cores per node for simulation usage. In the presented simulation test cases, all benchmark runs have been conducted on a single thin node of the ForHLR II cluster computer.

The input network case files used for testing the algorithm are standard test systems consisting of IEEE benchmark models and large network models representing the European power grid from the Pegase project [29]. The input files of the network structures are obtained from Matpower casefiles. Table I summarizes the setup of the considered networks in terms of number of generators (gens), buses, branches and loads.

TABLE I
SUMMARY OF THE SIZE OF APPLIED TEST NETWORKS

Network            Gens   Buses   Branches   Loads
Case9                 3       9          9       3
Case30                6      30         41      20
Case118              54     118        186      99
Case300              69     300        411     201
Case1354pegase      260    1354       1991     673
Case9241pegase     1445    9241      16049    4895
Case13659pegase    4092   13659      20467    5544

In each test case, the simulation is run for 10 s, with a step size of 1 ms. In addition, an event – in form of a bus fault – is triggered during the simulation. The fault is applied by inserting a high shunt value on a bus (representing a short circuit) for a given duration and cleared by resetting the bus shunt to the original value. This results in two events in the process, which implies that the network topology changes twice during the simulation.

B. Validation of Simulation Accuracy

In the first step of the evaluation, the Julia-based parallel dynamic algorithm is validated against the sequential Matlab-based dynamic simulation toolbox described in [27]. The results obtained from the two simulations are compared to evaluate the level of accuracy of the implemented component models and numerical solution strategy in the proposed parallel dynamic simulation algorithm. For purposes of illustration, simulation results using a simple network case – the
is compared to the open-source Matlab based sequential time- IEEE nine-bus system (Case9) – are presented to analyze
domain simulation as presented in [27]. The second case the accuracy of the algorithm. The network is set up as
evaluates the speedup of the parallel algorithm in reference described in [15], with similar generator models and controller
to sequential implementation considering different network system models (turbine-governor and exciter). The network is


partitioned into two subsystems for application to the parallel dynamic simulation. The resulting network partitions using the new extended partitioning scheme are as follows:

• Partition 1 – [2, 3, 8, 9]
• Partition 2 – [1, 4, 5]
• Interconnect partition – [6, 7]

The numbers represent the bus indices in the network. In order to test the simulation of events in the network, a three-phase short circuit fault is applied on bus 5 at time t = 1.2 s and cleared at t = 1.25 s. Fig. 3 to Fig. 5 show the comparison of the results using the sequential Matlab-based simulation and the proposed parallel dynamic simulation in Julia. The plotted variables are generator relative rotor angle, rotational speed deviation, and bus voltage magnitude. In Fig. 3, the relative rotor angle is computed with respect to generator 1 rotor angle δ1, giving δ21 = δ2 − δ1 and δ31 = δ3 − δ1. The rotational speed deviation ∆ω in Fig. 4 is calculated with respect to the synchronous speed ωs = 1 pu. Key system buses, including the generator buses and the fault bus, are selected for illustration of the voltage response in Fig. 5.

Fig. 3. Comparison of generator relative rotor angles following a bus fault in MatDyn toolbox and the new Julia-based parallel dynamic simulation tool
[Plot: relative angle [deg] versus time [s]; curves MatDyn δ21, MatDyn δ31, Parallel δ21, Parallel δ31]

Fig. 4. Comparison of generator rotational speed deviation following a bus fault on bus 5 in MatDyn toolbox and the new Julia-based parallel dynamic simulation tool
[Plot: speed deviation [pu] (axis scale ×10^−3) versus time [s]; curves MatDyn ∆ω1, ∆ω2, ∆ω3 and Parallel ∆ω1, ∆ω2, ∆ω3]

Fig. 5. Comparison of bus voltage response following a fault on bus 5 in MatDyn toolbox and the new Julia-based parallel dynamic simulation tool
[Plot: voltage [pu] versus time [s]; curves MatDyn Bus1, Bus2, Bus3, Bus5 and Parallel Bus1, Bus2, Bus3, Bus5]

From the results shown in Fig. 3 – Fig. 5, it is observed that there is a perfect match in the results between the proposed parallel simulation and the sequential simulation. This shows that the BBDF formulation of the network equation and solution using subnetworks correctly replicates the results of the original network equation formulation. The perfect match can be attributed to application of the same numerical solution strategy in both simulation cases based on the Runge-Kutta method for discretization of the differential equations and a series of LU factorizations for the solution of the algebraic equations.

C. Evaluation of Performance

In order to evaluate the performance of the parallel dynamic grid simulation, the sequential version of the algorithm is initially extended to a similar programming environment, Julia. The evaluation is performed in terms of computational speedup. The performance of the sequential and parallel algorithms in the Julia environment is evaluated using the different test networks in reference to the Matlab-based sequential simulation toolbox in [27]. For the parallel simulation, the optimal partitioning count is considered for the comparison. Table II gives a summary of the optimal partitioning results for the different network structures. The simulation runtimes in the three algorithms are illustrated in Fig. 6 to compare the minimum computation runtimes.

TABLE II
OPTIMAL NETWORK PARTITIONING COUNT

Network      Optimal partition   Average size   Interconnect size
Case9         2                      3              2
Case30        4                      6              6
Case118       6                     16             18
Case300       5                     57             15
Case1354      7                    188             37
Case9241     16                    567            161
Case13659    10                   1356             97

Fig. 6 shows that the sequential and parallel Julia implementations are faster than the Matlab-based implementation for all test cases. This performance difference is attributed to the high performance capability provided by the Julia programming environment. With this in mind, the sequential Julia computation runtimes are used for further analysis of the speedup attained by the parallel dynamic simulation. The computed speedup is shown in Table III with various partition sizes for all tested networks. The speedup is computed according to the relation Speedup = Ts/Tp, where Ts is the runtime in the sequential simulation and Tp is the runtime in the parallel simulation. The optimal partitioning resulting in the best speedup for each simulated network is highlighted (marked with an asterisk) in Table III.

Fig. 6. Comparison of computational runtime of dynamic simulations in the MatDyn toolbox, and using the new sequential algorithm and new parallel algorithm in Julia
[Bar chart: runtime [s] on a logarithmic axis (10^0 to 10^4) for Case9, Case30, Case118, Case300, Case1354, Case9241 and Case13659; series MatDyn, New Sequential, New Parallel. Bar labels in extraction order: 1,274.38; 556.45; 379.56; 246.82; 212.31; 134.83; 48.6; 25.22; 18.52; 17.16; 15.34; 13.28; 12.81; 6.64; 4.73; 3.95; 3.19; 2.11; 1.93; 1.78; 1.54]

TABLE III
NETWORK PARTITIONING COUNT AND RESPECTIVE COMPUTATIONAL SPEEDUP

Network
partitions   Case9     Case30    Case118   Case300   Case1354  Case9241  Case13659
 1           1.0000    1.0000    1.0000    1.0000    1.0000    1.0000    1.0000
 2           0.8006*   0.7776    1.0251    1.0641    0.9064    0.9046    0.9009
 3           0.7805    0.7770    0.9821    0.7486    1.0742    1.0389    1.0255
 4           -         0.8449*   1.1902    1.3385    1.2949    1.1079    1.1625
 5           -         0.8176    1.2385    1.4028*   1.3297    1.2058    1.3096
 6           -         0.7448    1.2390*   1.3491    1.3590    1.2201    1.2706
 7           -         0.7760    1.0787    1.2970    1.4696*   1.3118    1.3962
 8           -         0.7669    1.1326    1.2765    1.3981    1.4295    1.3682
10           -         0.7130    1.0745    1.2657    1.2079    1.5049    1.5380*
12           -         -         1.0185    0.7766    1.2572    1.5445    1.5183
14           -         -         0.9790    1.1573    1.3206    1.5532    1.4350
16           -         -         0.9633    1.1328    1.4160    1.5746*   1.4905
18           -         -         0.9319    1.0927    1.2374    1.5742    1.5192
20           -         -         0.9023    1.0657    1.3920    1.5090    1.5188

From the above results, the simulations in the proposed parallel dynamic algorithm are relatively slower than the corresponding sequential simulations for small network cases. However, the parallel runtime shows significant improvements with increasing network sizes as shown in Fig. 6, where Case9241 and Case13659 achieve 57.46% and 53.8% speedup, respectively. Furthermore, the speedup is observed to vary with the number of partitions. For each of the tested networks, an optimal partitioning exists at which the computation runtime is minimum as shown in Table III. This behavior can be explained as follows: A higher number of partitions results in a larger interconnect partition, which increases the sequential runtime of the network solution. At the same time, the parallelizable partitions differ and decrease in size, causing an increase in waiting times and less parallelizable processing than the sequential task. Therefore, the quality of partitioning affects the overall computation.

The factors influencing the performance in the presented parallel algorithm are: (i) the increase in problem complexity due to the restructuring using the BBDF formulation and (ii) the data exchange in a simulation step. For small networks, the size of the resulting parallelizable tasks is generally small due to the small number of buses distributed to each subnetwork. This implies that the increase in problem complexity and the data exchange overhead outweigh the parallelization benefits. As the network size increases, grid partitioning results in a significant size of parallelizable tasks since a considerable number of buses can be placed in each subsystem. In this case, the parallelization benefit outweighs the increase in problem complexity, thus achieving a speedup greater than one.

Fig. 7. Summary of runtime of the numerical solver stage in the new Julia-based parallel dynamic simulation tool; showing the components of the parallel solver runtime compared to the sequential solver runtime
[Bar chart: runtime [s] for Case300, Case1354, Case9241 and Case13659; series Sequential solver runtime, Parallel solver (par time), Parallel solver (seq time), Parallel solver (dataIO)]

However, the low speedup observed in the presented results can be explained as follows: The communication flow in Fig. 2 shows that a simulation step consists of two parallel stages. Between the two stages, variables have to be collected by the main partition in order to execute the sequential step of the network solution and then send the results back to the individual partitions for the second parallel stage. Fig. 7 shows a summary of the simulation runtime for the numerical solver stage, which is the main computation in a simulation step. The runtime of the numerical solver stage in the parallel algorithm consists of the parallel component for the solution of the differential equations and the network equations for the node voltages (par time), the sequential solution for the link voltages (seq time) and the data exchange time (dataIO) as shown in Fig. 7. For comparison, the runtime of the numerical solver in the sequential algorithm (Sequential solver runtime) is also included in the figure.

Analysis of Fig. 7 shows the solver parallel component – parallel solver (par time) – is small compared to the total solver runtime. Furthermore, the parallel solver sequential component – parallel solver (seq time) – is a very small percentage of the solver runtime. This is due to the fact that the sequential


step is optimized using an efficient LU solver and optimized network partitioning. Therefore, the major factor for the low speedup is the data exchange resulting in a high number of synchronizations in the parallel execution. Part of the future work is to further optimize the simulation algorithm in order to reduce the data exchange overhead.

V. CONCLUSION

The present paper introduces a new parallel time-domain simulation tool based on the block bordered diagonal form for reformulation of the network equation solution. The implementation relies on an extended partitioning strategy for decomposing the network structure into parallelizable subnetworks exchanging information at every simulation time step through the boundary buses in the interconnect partition. The accuracy of the simulation is compared to a validated sequential simulation toolbox and found to perfectly match in terms of the derived response profiles. For the tested network cases, a performance improvement is achieved as the network size increases. In addition, the computation runtime is seen to be dependent on the quality of the partitioning.

It is important to note that the presented results of the parallel dynamic simulation algorithm are obtained using Julia version 0.6.3 – with experimental multi-threading status – which lacks support for multi-threading of nested loops. Part of the future work is to extend the algorithm to Julia versions that provide general task parallelism properties as described in [30]. Furthermore, the parallel algorithm will be extended for testing on more than one computing node of the ForHLR II computing cluster.

ACKNOWLEDGMENT

This work is part of the Energy Systems 2050 project, an initiative of the Helmholtz Association. The work was performed on the supercomputer ForHLR funded by the Ministry of Science, Research and Arts Baden-Württemberg and by the Federal Ministry of Education and Research.

REFERENCES

[1] R. C. Green, L. Wang and M. Alam, “High performance computing for electric power systems: Applications and trends,” in IEEE Power and Energy Society General Meeting, San Diego, CA, 2011.
[2] D. M. Falcao, “High performance computing in power system applications,” in International Conference on Vector and Parallel Processing, Porto, Portugal, 1997.
[3] C. Dufour, V. Jalili-Marandi, J. Bélanger and L. Snider, “Power system simulation algorithms for parallel computer architectures,” in IEEE Power and Energy Society General Meeting, San Diego, CA, 2012.
[4] I. Decker, D. Falcao and E. Kaszkurewicz, “Conjugate gradient methods for power system dynamic simulation on parallel computers,” IEEE Transactions on Power Systems, vol. 11, no. 3, pp. 1218–1227, 1996.
[5] M. Tomim, J. Martí and L. Wang, “Parallel solution of large power system networks using the Multi-Area Thévenin Equivalents (MATE) algorithm,” International Journal of Electrical Power & Energy Systems, vol. 31, no. 9, pp. 497–503, 2009.
[6] M. L. Scala, R. Sbrizzai and F. Torelli, “A pipelined-in-time parallel algorithm for transient stability analysis (power systems),” IEEE Transactions on Power Systems, vol. 6, no. 2, pp. 715–722, May 1991.
[7] M. Crow and M. Ilic, “The parallel implementation of waveform relaxation methods for transient stability simulations,” IEEE Transactions on Power Systems, vol. 5, no. 3, pp. 922–932, Aug. 1990.
[8] L. Hou and A. Bose, “Implementation of the waveform relaxation algorithm on a shared memory computer for the transient stability problem,” IEEE Transactions on Power Systems, vol. 12, no. 3, pp. 1053–1060, Aug. 1997.
[9] J. Bezanson, A. Edelman, S. Karpinski and V. Shah, “Julia: A fresh approach to numerical computing,” SIAM Review, vol. 59, no. 1, pp. 65–98, 2017.
[10] T. A. Davis, “UMFPACK user guide, version 5.6.2,” https://www.suitesparse.com, 2013.
[11] IEEE Std 421.5-2005, “IEEE recommended practice for excitation system models for power system stability studies,” IEEE, New York, 2006.
[12] Task Force on Turbine-Governor Modeling, “Dynamic models for turbine-governors in power system studies,” IEEE Power & Energy Society, 2013.
[13] S. Soman, S. Khaparde and S. Pandit, Computational methods for large sparse power systems analysis; An object oriented approach, Kluwer Academic Publishers, 2002.
[14] M. L. Crow, Computational methods for electric power systems, New York: CRC Press, 2009.
[15] M. Kyesswa, H. Çakmak, U. Kühnapfel and V. Hagenmeyer, “A Matlab-based dynamic simulation module for power system transients analysis in the eASiMOV framework,” in European Modelling Symposium, Manchester, UK, Nov. 2017.
[16] M. Kyesswa, H. K. Çakmak, U. Kühnapfel, and V. Hagenmeyer, “A Matlab-based simulation tool for the analysis of unsymmetrical power system transients in large networks,” in European Conference on Modelling and Simulation (ECMS), Wilhelmshaven, Germany, May 2018.
[17] I. Decker, D. Falcao and E. Kaszkurewicz, “An efficient parallel method for transient stability analysis,” in Proceedings of the Tenth Power Systems Computation Conference, 1990.
[18] A. Buluç, H. Meyerhenke, I. Safro, P. Sanders and C. Schulz, “Recent advances in graph partitioning,” in Algorithm Engineering, Cham, Springer, pp. 117–158, 2016.
[19] P. Sanders and C. Schulz, “High quality graph partitioning,” in 10th DIMACS implementation challenge workshop: Graph Partitioning and Graph Clustering, 2013.
[20] P. Sanders and C. Schulz, “Think locally, act globally: Highly balanced graph partitioning,” in Experimental Algorithms, Springer Berlin Heidelberg, pp. 164–175, 2013.
[21] P. Sanders and C. Schulz, “Engineering multilevel graph partitioning algorithms,” in 19th European Symposium on Algorithms, 2011.
[22] J. Maue and P. Sanders, “Engineering algorithms for approximate weighted matching,” in International Workshop on Experimental and Efficient Algorithms, 2007.
[23] F. Pellegrini, “Distillating knowledge about SCOTCH,” in Combinatorial Scientific Computing, 2009.
[24] R. D. Zimmerman, C. E. Murillo-Sanchez and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12–19, 2011.
[25] G. Karypis and V. Kumar, “A fast and high quality multilevel scheme for partitioning irregular graphs,” SIAM Journal on Scientific Computing, vol. 20, pp. 359–392, 1998.
[26] C. Coffrin, R. Bent, K. Sundar, Y. Ng and M. Lubin, “PowerModels.jl: An open-source framework for exploring power flow formulations,” in 2018 Power Systems Computation Conference (PSCC), 2018.
[27] S. Cole and R. Belmans, “MatDyn, A new Matlab-based toolbox for power system dynamic simulation,” IEEE Transactions on Power Systems, vol. 26, no. 3, pp. 1129–1136, Aug. 2011.
[28] “KIT - SCC - ForHLR II,” [Online]. Available: https://www.scc.kit.edu/dienste/forhlr2.php.
[29] C. Josz, S. Fliscounakis, J. Maeght and P. Panciatici, “AC power flow data in MATPOWER and QCQP format: iTesla, RTE snapshots, and PEGASE,” 2016. [Online]. Available: http://arxiv.org/abs/1603.01533.
[30] J. Bezanson, J. Nash and K. Pamnany, “Announcing composable multi-threaded parallelism in Julia,” Julia, 23 July 2019. [Online]. Available: https://julialang.org/blog/2019/07/multithreading.
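
As a minimal sketch of the speedup relation used in Section IV-C (Speedup = Ts/Tp), the following Python snippet computes a speedup from two runtimes and selects the partition count with the highest speedup. The function names and the example runtimes are hypothetical and are not part of the authors' Julia tool; the Case13659 speedup values are copied from Table III.

```python
def speedup(t_seq: float, t_par: float) -> float:
    """Speedup = Ts / Tp, with both runtimes given in seconds."""
    if t_par <= 0:
        raise ValueError("parallel runtime must be positive")
    return t_seq / t_par


def best_partition_count(speedups: dict) -> int:
    """Return the partition count whose speedup is largest."""
    return max(speedups, key=speedups.get)


# Speedup per partition count for Case13659, copied from Table III.
CASE13659 = {
    2: 0.9009, 3: 1.0255, 4: 1.1625, 5: 1.3096, 6: 1.2706,
    7: 1.3962, 8: 1.3682, 10: 1.5380, 12: 1.5183, 14: 1.4350,
    16: 1.4905, 18: 1.5192, 20: 1.5188,
}

if __name__ == "__main__":
    # Hypothetical runtimes in seconds, for illustration only.
    print(speedup(3.0, 2.0))
    print(best_partition_count(CASE13659))
```

Under these assumptions, the selected partition count for Case13659 agrees with the optimal count of 10 listed in Table II.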


On Verification of Designed Energy Systems Using Distributed Co-Simulations

Anselm Erdmann¹, Anna Marcellan², Dominik Hering³, Michael Suriyah⁴, Carolin Ulbrich⁵, Martin Henke², André Xhonneux³, Dirk Müller³, Rutger Schlatmann⁵, Veit Hagenmeyer¹

¹ Institute for Automation and Applied Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
² Institute of Combustion Technology, German Aerospace Center, Stuttgart, Germany
³ Institute of Energy and Climate Research, Energy Systems Engineering (IEK-10), Forschungszentrum Jülich GmbH, Jülich, Germany
⁴ Institute of Electric Energy Systems and High-Voltage Technology, Karlsruhe Institute of Technology, Karlsruhe, Germany
⁵ PVcomB, Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, Germany

anselm.erdmann@kit.edu, anna.marcellan@dlr.de, d.hering@fz-juelich.de, michael.suriyah@kit.edu, carolin.ulbrich@helmholtz-berlin.de, martin.henke@dlr.de, a.xhonneux@fz-juelich.de, di.mueller@fz-juelich.de, rutger.schlatmann@helmholtz-berlin.de, veit.hagenmeyer@kit.edu

Abstract: Simulation is an essential part of the energy systems design procedure, since it serves as a tool for verifying the respective design. It verifies the stable operation of a developed energy systems infrastructure before it is realized. As energy systems integration becomes an important part of a future low-carbon energy scenario, the cooperation of experts specialized in the various domains that are crucial to single aspects of the energy system is indispensable. Co-simulation enables each expert to model in a familiar environment, but it requires a detailed coordination of the simulation interfaces between the specific expert models. Hence, standardized interfaces are crucial to the efficient use of expert knowledge in distributed co-simulations. Therefore, the present paper introduces a workflow for the co-simulation development of energy systems simulations, which simplifies the coordination procedure significantly by standardizing the interfaces between the models and their simulations. The approach is applied by way of example to the energy system design of a district comprising electricity and heat in order to show its successful performance.

Index Terms: co-simulation, energy systems integration, energy systems design, interfaces

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

As the international focus on reducing the number of available fossil power plants intensifies [1]–[4], cheap and strongly fluctuating generation from renewable energies is growing. The challenging transition of the current energy system into a sustainable multimodal energy system needs new operating strategies [5].

Common approaches and surveys about the future design of the energy system take a top-level approach from a national or international perspective (e.g. [6]–[9]). Yet, operating issues and local needs have to be considered in further steps. In lower level simulations the operability of the designed energy system can be verified. The modelling of energy systems comprising many different technologies [10]–[12] requires a high effort to consider all system attributes in the same simulation development environment and therefore requires specific expertise.

In this context, distributing simulations into co-simulations [13]–[15] enables the collaboration of specialists from the various areas of expertise. By doing so, the existing simulations including the latest research and insights by specialized experts can be fully integrated into the co-simulation. Nevertheless, this approach requires a high coordination effort between the cooperation partners. Reducing this effort by defining a unique interface between the simulation parts at the beginning simplifies the collaboration fundamentally. Therefore, introducing a new approach for obtaining an interface definition is the main purpose of the present paper.

The remainder of this paper is organized as follows: Section II discusses related work in this area of interest. Section III and Section IV introduce a coordinating approach to the distribution procedure of simulations by defining the interfaces between the simulations on multiple layers. This approach is applied by way of example to the energy system design of a district in Section V. Finally, Section VI concludes with a discussion and outlook.

II. RELATED WORK

Distributed co-simulations for multimodal energy systems are rare in the literature. Partly, this rarity is probably due to the lack of expert knowledge across all the respective fields. In this context, an approach that takes into account the whole hierarchy between the European transmission grid and local prosumers is contained in the Energy System Development Plan (ESDP) [16]. This approach builds local energy cells with assumptions for distribution grids. These energy cells are interconnected via the real European transmission grid [16]. Thereby, the gap between locally available flexibility and transregional flexibility demand is bridged by simulating a market-driven mode of operation. In these cells, the flexibility is available for the entire network. Heat is considered in


the form of demands and heat-supplying energy conversion units. The simulation consists of different steps, in which a market simulation determines the schedule and the transmission grid behaviour is simulated in a steady state power flow simulation [16]. Since the simulation is split into a sequentially executable toolchain, the data exchange between the simulations is straightforward.

Another approach with focus on various heating systems is MESCOS [17], which also includes an electrical grid simulation. In [17], models are developed in a couple of modelling tools and co-simulated via network communication. Thereby, the interfaces between the different simulators are defined in a scenario description file.

Furthermore, in [18] a gas and a heat grid simulation is integrated into the OpSim platform [19] by linking already existing simulation tools for the respective domains. The simulation tools used are adapted to the OpSim message bus according to the OpSim Proxy/Client concept [18].

Moreover, a simulation framework for multi-carrier energy systems is presented in [20]. It is designed for the cooperation between experts of various domains in particular. The central orchestration of the co-simulation is performed in MATLAB®. The interface design between the different simulations is recognized as a central challenge. However, a high number of different simulation tools is seen as an obstacle [20].

In [21], requirements for coupling of simulators are identified. Besides the special application field of multimodal energy system simulations, there are more general ambitions for coupling physical simulations. Initial efforts to standardize physical simulation interfaces on the technical level are already in use with the Functional Mockup Interface (FMI) definitions [22], [23]. In the future, the new standard for network co-simulation DCP [24] could also play an important role.

In this context, to the best of our knowledge, there is no holistic procedure for defining interfaces for the distributed simulation of energy systems. Hence, it is an open scientific problem to define these interfaces for enabling a frictionless model development by the individual domain experts. Therefore, the present paper introduces a clear interface definition for energy systems simulations in order to avoid time-consuming adaptions of the simulation models and incompatibilities that may occur in the end.

III. INTERFACING SIMULATIONS

The interfacing has to be carried out in several layers, of which the procedure is presented in Section IV. In the present section, the stack of simulation interfaces for distributed simulations is introduced. For this, a single simulation that is a part of the distributed simulation is called a simulation module. Table I gives an overview of the layers.

TABLE I
INTERFACE LAYERS FOR ENERGY SYSTEM CO-SIMULATIONS

Layer   Name                     Description
3       semantic                 scenario(s), infrastructure, energy flow
2       information flow         variables, information direction
1       simulators interaction   simulation control and communication
0       simulation technology    (simulator dependent)

A. Semantic layer

The semantic layer considers the energy system, which is intended to be modelled. It further defines the respective scenario(s), for which this modelling is undertaken. It consists of energy grids and connected devices. In this perspective, the energy flow through the system is considered. All scientific investigations of the energy system design are dealt with on this layer. Furthermore, the initial splitting of the logical simulation parts is performed.

B. Information flow layer

Information exchange is an essential part of distributed simulations. The simulation modules are considered as black boxes with interfaces for data exchange with other simulation modules. The calculation of the current state in the black box can require input data delivered from external simulation modules.

In the information flow layer the interfaces of the simulation modules and their dependencies are specified. Each interface contains the value of a floating point variable, which represents a physical quantity with one unit. Formally, an interface consists of a quadruple {identifier, type, value, unit}, where the identifier is a unique name, the type is either input or output, the value is a floating point number, and the unit is an SI unit. An input interface is defined for each variable that a simulation module requires from another simulation module. Similarly, an output interface is defined for each state variable that a simulation module provides for transmission to other simulation modules. To ensure the operability of the simulations, every input interface needs to be connected with an output interface of a fitting quantity.

The assignment of the output interfaces to the corresponding input interfaces is performed according to the scenario defined in the semantic layer. The information flow is always directed from an output interface to an input interface. Bidirectional connections can be represented by two oppositely directed connections.

A complete definition of this layer contains a list of all interfaces of all simulation modules, and additionally, the corresponding output interface for each input interface.

C. Simulators interaction layer

The simulators interaction layer provides the information exchange between the simulation modules according to the connections determined in the information flow layer. The communication between the simulation modules has to be clearly specified and implemented by each associated simulation module. Alternatively, simulation tools can be adapted to the specified communication format with an interconnected module. Most of the co-simulation standards are using this concept, like the FMI standard [23] for local co-simulation


or DCP [24] for distributed co-simulation as well as the co-simulation frameworks [17], [19], [25].
During the simulation run, the values of the single variables are transmitted from the output interfaces to the input interfaces according to the specification in the information flow layer.

D. Simulation technology layer
On the bottom layer, each simulation tool has its own simulation technology. To provide the required interface for the simulation interaction layer, an intuitive solution is the implementation of individual adaptations for existing simulation tools by using their programming APIs (as applied in [17], [18]) or adapting them otherwise. In [20], a Python wrapper is implemented to adapt commercial simulation tools to their co-simulation framework.
Simulation tools that already provide the widespread FMI standard can be easily integrated by creating an adaptation for the FMI standard once, as applied in [25] or for a smart grid simulation framework in [26].

IV. DISTRIBUTING A SIMULATION
Verification by simulation is an important part of designing energy systems. For this verification, detailed simulation modules of different domains are needed. For an efficient modelling of multimodal energy systems by experts of the respective domains in parallel, a coordination procedure is required. In the present section, the procedure of distributing the simulation is introduced. The splitting begins after an appropriate simulation scenario is determined. Finally, the independently developed simulation models are united in a common co-simulation without any adaptation effort.

A. Finding an appropriate level of detail
Energy systems can have very different dynamic behaviours. While dynamic investigations of electricity grids consider sub-second time periods, the dynamic behaviour of gas and heat grids is much slower. From a technical point of view, it is possible to create a co-simulation respecting all these issues. However, the execution of this type of co-simulation requires an enormous amount of computing power. If effects occurring in one subsystem have a negligible influence on other subsystems, these effects can be excluded from the co-simulation and investigated in an independent additional simulation. For this reason, the level of detail is determined by the issues to be investigated.
Seasonal behaviour is an important factor in energy systems integration. Investigations usually consider periods of one year or even more (e.g. [6], [12]). The computational executability within an acceptable execution time has to be taken into account by every simulation module (see also subsection IV-C). A typical example for the adjustment of the level of detail in the simulations is given in this case for the modelling of the power grid. To consider large time scales, the electricity grid is simulated in a steady state power flow simulation. This simulation is repeated periodically, typically every 15 minutes to one hour.

B. Splitting the simulation
To achieve a frictionless fitting of the distributed simulation parts, all interfaces are defined before the individual models are created.
After the scenario is clarified, the interface definition can be performed according to the interface layer stack introduced in Section III. On the semantic layer, the energy system is sketched in the shape of grids and connected components like consumers or generators. Energy and material like fuel are exchanged between the connected components across the grid. From this sketch, the responsibilities for components and grids are divided among the participating experts. The subsequent independent modelling of these components in separate simulation modules is the responsibility of the experts.
In the next step, the correspondingly assigned experts identify the required information transfer on the information flow layer for each energy or material flow. To do this, they draw up a list of individual interfaces, each consisting of a variable with the corresponding unit. For physical simulations, the variables depend on a real-world system. Modelled representations of these systems have a high similarity. Thus, for the time-step based simulation of physical systems, a generalization of the interfaces is possible due to the low level modelling of a real system. A reusable definition of interfaces on this layer for electricity and heat grids is shown in Section V. In simulations considering communication behaviour or other kinds of event-based simulation, reusable definitions of interfaces are not feasible in general.
A jointly used co-simulation platform has to provide the connections between the simulation modules according to the elaborated list of interfaces. The co-simulation platform will conduct the composed distributed simulation in the end. An important question regarding the choice of the platform is whether the simulation should be performed locally on one machine or distributed on several machines. Furthermore, confidentiality obligations can necessitate the execution of some simulation models in a geographically restricted area, which requires a platform supporting co-simulations over large distances.
The development and simulation tool for each simulation module is individually chosen by each expert, but the compatibility between the co-simulation platform and the chosen simulation tool is mandatory. It is advisable to take this into account when choosing the co-simulation platform to avoid costly simulation tool adaptations. In [20], the selection of an appropriate simulation environment is actually recommended as the first step.
The whole procedure is applied as an example to a district combined heat and power simulation case in Section V.
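The interface list drawn up in Section IV-B (one variable with its unit per interface) lends itself to an automatic consistency check before any model is built. The following is only a minimal sketch under stated assumptions: the class `Interface`, the function `check_connections` and the example entries are hypothetical illustrations and are not part of the co-simulation platform of [25] or of any cited framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interface:
    """One entry of the interface list (hypothetical structure)."""
    module: str     # simulation module owning the interface
    name: str       # variable name
    unit: str       # physical unit of the variable
    direction: str  # "input" or "output"


def check_connections(interfaces, connections):
    """Return a list of problems; an empty list means the list is consistent."""
    problems = []
    connected_inputs = set()
    for source, target in connections:
        src = f"{source.module}.{source.name}"
        dst = f"{target.module}.{target.name}"
        if source.direction != "output" or target.direction != "input":
            problems.append(f"wrong direction: {src} -> {dst}")
        if source.unit != target.unit:
            problems.append(
                f"unit mismatch: {src} [{source.unit}] -> {dst} [{target.unit}]"
            )
        connected_inputs.add(target)
    # Every input interface needs exactly such a connection to be usable.
    for itf in interfaces:
        if itf.direction == "input" and itf not in connected_inputs:
            problems.append(f"unconnected input: {itf.module}.{itf.name}")
    return problems


# Hypothetical excerpt of an interface list for a heat grid, a CHP and a weather module.
grid_t_ret = Interface("heat_grid", "T_ret", "K", "output")
chp_t_ret = Interface("chp", "T_ret", "K", "input")
weather_t_amb = Interface("weather", "T_amb", "K", "output")
chp_t_amb = Interface("chp", "T_amb", "K", "input")

interfaces = [grid_t_ret, chp_t_ret, weather_t_amb, chp_t_amb]
connections = [(grid_t_ret, chp_t_ret), (weather_t_amb, chp_t_amb)]
print(check_connections(interfaces, connections))  # prints []
```

Running such a check on the agreed list before modelling starts is one way to catch unit or direction mismatches early, when they are still cheap to fix.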


C. Independent model development
The great advantage of distributing simulations is the separation of the simulation models. Different research groups can develop and implement simulation models that represent different systems. The individual models are developed independently of each other and differ both in their representation and in their runtime environment. They can be represented as differential equation systems, discrete automata, etc. The only restriction is the supply of interfaces according to the specified interface list in the information flow layer. Thus, the modelling is done on the level of subsystems without having the coupled problem in mind. Ideally, these models have been created by experts in their respective field and are properly validated and thus recognized. Since the subdomains involved already use numerous established technologies and simulation tools, distributed simulation enables the easy reuse of these simulation models. Moreover, these existing models do not need to be adapted to the respective simulator architecture or even reformulated. Each model is developed on its own platform and in such a way that the later unified distributed simulation is carried out by black box operation of the subsystems with minimal computational effort. The simulation modules calculate their output variables as a function of the incoming input variables and the simulation time. In some cases, precalculations during the model development can significantly reduce the calculation effort during the simulation run.

D. Unifying the distributed simulation
After the independent modelling is completed, all simulations are linked together. If the interface specifications are met, no further configuration in the simulation modules is required. The mapping of the simulation outputs and interfaces according to the specification on the information flow layer, as well as further global settings, is handled by the co-simulation framework. When a simulation is started, the different simulation modules exchange information via the specified interfaces. At the semantic level, the operation of the energy system is simulated in order to achieve the specific goals initially set (e.g. low emissions, low costs).
The physical simulation represents a continuous process, which is usually simulated in time steps. Control can also be integrated time-step-based via specified interfaces. Alternatively, a direct event-based addressing of the controlled units can be implemented independently of the physical simulation. The co-simulation platform has to ensure a deadlock-free execution sequence of the simultaneously running simulations.

E. Effects of distributing a simulation
Distributing a simulation is accompanied by accuracy issues. One issue concerns the splitting of the equations describing the system's behaviour. Representing the whole considered system in a closed system of equations is impractical due to the huge computational effort of solving this system. This can be overcome by splitting the simulated system into weakly coupled subsystems as described in [27] and accepting the accompanying inaccuracies. For this reason, a skilled selection of the subsystem borders is beneficial. If there are unavoidable delays in the simulation, it is possible to capitalize on them as described in [28]. In energy systems this could become interesting when control systems are considered and the time step size is appropriately small.
Increasing the accuracy by reducing the time step size is possible. However, this is accompanied by a higher computational time demand. Hence, the optimal compromise between the required accuracy and computation time has to be determined for every application case. The usage of various time step sizes is discussed in [20].

V. EXEMPLARY APPLICATION
The approach described in the previous sections is demonstrated in the following using the example of a district simulation with a photovoltaic system, heat storage, electricity storage and mini-CHP (Combined Heat and Power unit). This test case has been selected as a useful addition in the context of large transmission grid simulations, which are usually limited to optimization strategies at a higher dispatch level with long-term optimization. It is comparable to the energy cell simulations of the ESDP [16]. This example can be used to consider effects on the shorter time scale, such as fluctuations in energy demand and availability, energy costs, maintenance planning, or storage capacity. Different operating strategies can be tested as well as the effects of fluctuations in weather conditions. Furthermore, universal interface definitions on the information flow layer for power grids and for heat grids are provided.

A. Application case
A demonstration district with exemplary consumption data is simulated with individual system models representing state-of-the-art technologies. A power grid model based on real-world data is integrated with a generic heat grid model. A photovoltaic installation, a heat storage, an electricity storage and a mini-CHP are connected to each other and to the power and heat grid. In this test case the mini-CHP and the power storage are operated in order to use a high share of the generated electrical energy and photovoltaic feed-in while simultaneously ensuring the thermal supply.

[Figure: block diagram with the components heat grid, heat storage, fuel, mini-CHP, three households (two with PV), power grid, power storage and external grid]
Fig. 1. Distributed simulation case on the semantic layer
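For modules such as those in Fig. 1, the time-step-based exchange described in Section IV-D can be sketched as a minimal master loop. This is an illustration only, not the platform of [25]: the classes `ToyChp` and `ToyHeatGrid`, the function `run`, the constant 100 kW demand and the fixed step size are assumptions, and each module simply receives the outputs that its partners produced in the previous step.

```python
class ToyChp:
    """Hypothetical CHP stand-in: follows the thermal setpoint up to a limit."""

    def __init__(self, p_th_max_kw):
        self.p_th_max_kw = p_th_max_kw

    def step(self, t, dt, inputs):
        setpoint = inputs.get("P_th_set", 0.0)
        return {"P_th": min(setpoint, self.p_th_max_kw)}


class ToyHeatGrid:
    """Hypothetical heat grid stand-in with a constant heat demand."""

    def __init__(self, demand_kw):
        self.demand_kw = demand_kw
        self.unserved_kwh = 0.0  # demand not covered by the received heat

    def step(self, t, dt, inputs):
        supplied = inputs.get("P_th", 0.0)
        self.unserved_kwh += max(self.demand_kw - supplied, 0.0) * dt / 3600.0
        return {"P_th_set": self.demand_kw}


def run(modules, connections, t_end, dt):
    """Advance all modules in lockstep, exchanging values once per time step."""
    outputs = {name: {} for name in modules}
    t = 0.0
    while t < t_end:
        # Map the previous step's outputs to the connected input interfaces.
        inputs = {name: {} for name in modules}
        for (src, out_var), (dst, in_var) in connections:
            if out_var in outputs[src]:
                inputs[dst][in_var] = outputs[src][out_var]
        # Each module is operated as a black box for one time step.
        outputs = {name: m.step(t, dt, inputs[name]) for name, m in modules.items()}
        t += dt
    return outputs


grid = ToyHeatGrid(demand_kw=100.0)
modules = {"chp": ToyChp(p_th_max_kw=180.0), "heat_grid": grid}
connections = [
    (("heat_grid", "P_th_set"), ("chp", "P_th_set")),
    (("chp", "P_th"), ("heat_grid", "P_th")),
]
final = run(modules, connections, t_end=3600.0, dt=900.0)
print(final["chp"]["P_th"], grid.unserved_kwh)  # prints 100.0 50.0
```

In this toy setup the demand is only met from the third step onwards, because the setpoint and the delivered heat each arrive one step late. The resulting unserved energy scales with the step size, which illustrates the trade-off between accuracy and computation time discussed in Section IV-E.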


TABLE II
INTERFACES FOR THE FLUID WATER HEAT GRID AND CONNECTORS

Pos.   Interface                              Unit    Note
(1)    District heating
(1.1)  ambient air temperature                K
(2)    District heating per connector
(2.1)  water temp. in grid direction¹         K
(2.2)  mass flow rate²                        kg/s
(3)    Connector heat source
(3.1)  return temperature                     K
(3.2)  mass flow rate³                        kg/s
(3.3)  setpoint(s)                            (data)
(4)    Connector heat sink
(4.1)  flow temperature                       K
(4.2)  setpoint return temperature            (data)
(5)    Gas turbine, like (3), additionally
(5.1)  ambient air temperature                K
(6)    Heat storage, like (3) + (4)⁴, additionally
(6.1)  ambient air temperature                K

¹ return temperature for heat sinks, flow temperature for heat sources
² for sources (2.2) is only used if the pump is placed on the source side
³ (3.2) is only used if the pump is placed on the grid side
⁴ without setpoints in passive operation mode

TABLE III
INPUT INTERFACE LIST FOR A GENERIC CHP

Interface name  pos.   type   unit  connected to simulator
T_ret           (3.1)  input  K     heat grid
m_flow          (3.2)  input  kg/s  heat grid
P_th_set        (3.3)  input  (kW)  controller
T_amb           (5.1)  input  K     weather

B. Applying the interface determination procedure
The interface determination procedure introduced in Section IV is applied to the presented simulation case. First, the simulation scenario is split into the responsibilities of the experts. Figure 1 shows the distribution for the introduced application case. The individual simulation modules are a heat grid simulation including a storage and the heat consumers, a mini-CHP simulation, a photovoltaics and electricity consumer simulation, an electricity storage simulation, and a power grid simulation. Additionally, there is a (not depicted) weather simulation, which provides ambient air temperature and radiation data for the other simulation modules.
Subsequently, the interfaces on the information flow layer are determined. The district heating grid connects a heat supply to multiple consumers with pipes. In the majority of existing district heating systems, consumers use heat exchangers or direct connections to the grid. The power transport in a heat exchanger is mainly driven by the fluid temperatures and the fluid mass flow rates at the primary and secondary side. For the representation of connections, generalized connector input interfaces for the heating grid are shown in Table II. The interfaces on the information flow layer regarding the heating grid and its connectors can be deduced from Table II. The grid itself needs the ambient air temperature (1.1) as an input. This information is processed internally in the heating curve. The heating curve determines the setpoint for the flow temperature in the heating grid. Furthermore, for every connector of the district heating, an input interface describing the mass flow entering the heating grid (2.2) and an input interface containing the corresponding temperature of the incoming fluid (2.1) are defined. Heat source connectors (3) need the return temperature (3.1) from the grid and heat sink connectors (4) the flow temperature (4.1). For heat sources, the mass flow rate (2.2) can alternatively be shifted to the source side (3.2), if the water pump is part of the grid model.
Since the interface definition is oriented towards the input interfaces, the output of the heating grid is listed as the input interfaces of the connectors (3) and (4). (5) and (6) are special instances of (3) and (4). Every heating grid model requires the ambient air temperature (1.1) once and the input interfaces (2) once for each linked connector. Thereby, the design decision is made that the heating grid model requires an input interface (2.2) containing mass flow rate information for each connector. This decision represents a typical operation of a district heating grid. Usually, the grid does not have information about the demand at each consumer. Instead, each consumer increases or decreases the mass flow rate individually. Hence, the demand sets the mass flow rate (2.2). The total mass flow rate of the grid is the sum of all demands and needs to be supplied by the heating grid model.
Setpoints (printed in italic) are not physical quantities in the strict sense, since they represent only transmitted information. This information consists of simple data without any physical quantity. Alternatively, they can be treated as the physical quantity this information is associated with, or they can be excluded and treated event-based beyond the physical simulation.
The input interface list deduced as an example for a generic CHP is shown in Table III. Next to the interfaces, the connection to the associated simulator is noted in the last column. The associated simulators have to provide an appropriate output interface. The list of utilized output interfaces is determined from the connection information of the overall input interfaces of all simulators.
In the presented simulation case, the water pump is shifted from the heat grid simulation module to the mini-CHP simulation module. Consequently, the mass flow input interface has to be moved from the mini-CHP module to the heat grid module. In return, the mini-CHP module receives an additional setpoint for the flow temperature.
The generalized interface specification for steady state electricity grid simulations with slack bus is shown in Table IV. The electricity grid receives the active and the reactive power for every connected load (1) or generator (2). The only difference between the input interfaces is the sign of the value of the active power. While the active power of a load (1.2) is represented by a positive value, the active power of generators (2.1) is negative. (3), (4), (5) are generators, (6) is an instance of a generator as well as of a load.
For connecting the specified interfaces on the simulators


interaction level, the co-simulation platform presented in [25] is utilized.

TABLE IV
INTERFACES FOR THE AC POWER GRID AND CONNECTED DEVICES

Pos.   Interface                              Unit    Note
(1)    Electricity grid per load
(1.2)  active power                           W       ≥0
(1.3)  reactive power                         VA
(2)    Electricity grid per generator
(2.1)  active power                           W       ≤0
(2.2)  reactive power                         VA
(3)    Photovoltaic system (incl. inverter⁵)
(3.1)  direct radiation                       W/m²
(3.2)  diffuse radiation                      W/m²
(3.3)  ambient temperature                    K
(3.4)  (date and time⁶)                       (UTC)
(4)    Wind power plant
(4.1)  wind speed                             m/s
(4.2)  wind direction                         rad
(4.3)  phase angle set point                  (data)
(5)    AC-generator (synchronous)
(5.1)  torque                                 Nm
(5.2)  phase angle set point                  (data)
(6)    Storage
(6.1)  power set point                        (data)  ≷0

⁵ reactive power supply in dependence to the current active power
⁶ removed, if the time at simulation start is known

C. Simulation Development
All simulation modules are developed independently from each other. The co-simulation platform supports all required simulation tools.
The district heating grid is an idealized grid modelled in Modelica®. For the model generation, components of the AixLib library [29] are used. The pipe model was developed in [30] and is able to simulate thermal losses, pressure losses and temperature wave propagation. For the demonstration of co-simulation, a simple model is used. The grid supplies three buildings, with a combined peak heat demand of 176 kW. Furthermore, a thermal energy storage is added. The embedding into the co-simulation framework is performed by FMI export. The combined heat and power unit (CHP) integrated in the co-simulation is a microturbine based CHP with an AE-T100/Dürr CPS. With an electrical output of 100 kW and a thermal output of about 180 kW the CHP is integrated with a flexible burner, so that the machine can be operated with various fuels such as natural gas, biogas and synthesis gas with a calorific value of 7 to 49 MJ/kg. The cogeneration unit is modelled in a stationary 0-D simulation tool that is suitable for fast and robust analysis of complex cycles [31]. The model is implemented in MATLAB®/SIMULINK® and allows the modelling of various CHP subcomponents such as turbomachinery, burner, recuperator, heat exchanger, electric generator and intermediate circuit. For the modelling of the components, all gas flows are assumed to be ideal gas mixtures. Temperature dependent polynomials are used to calculate the heat capacity, viscosity and thermal conductivity of pure substances. The behaviour of turbomachinery is calculated based on the pressure ratio and the isentropic efficiency, which are interpolated from uploaded maps according to the operating point. Heat and pressure losses are taken into account and the model is validated with experimental data. The model output is delivered as a 5D look-up table, in which the performance of the CHP in terms of fuel consumption and emissions is available for different electrical/thermal set points, different conditions of the heating grid and ambient temperature. The integration of the 5D look-up table is performed by a Java implementation using an Octave engine for Java.
The simulation module simulating the electricity storage is directly implemented in Java. The characteristics of the associated power electronics of the inverter are integrated in an exchangeable file.
For the calculation of the power flow in the electricity grid, the MATPOWER library [32] is used in an Octave implementation for the co-simulation framework. The photovoltaics injection depends on the weather (radiation and temperature) and the time, from which the solar altitude can be derived for a specified location. To avoid computational effort, the solar injection is precalculated and replayed during the simulation run. This is possible because the values of all input interface dependencies can already be determined before the simulation run. The resulting time series is injected in the co-simulation framework as well as time series for the temperature and the loads in the CSV format.

D. Simulation Execution
The simulation run is demonstrated according to the simulation scenario depicted in Figure 1. The mini-CHP contains the micro gas turbine and the water pump. The water flow rate is calculated as a function of the thermal power setpoint and the flow temperatures determined by the heating grid simulation. The available interfaces on the information flow layer for this module are shown in Table V.

TABLE V
INTERFACE LIST FOR THE SIMULATION MODULE CONTAINING THE MINI-CHP AND THE WATER PUMP

Identifier   Description                  type    unit
T_amb        ambient air temperature      input   K
T_ret        return temperature           input   K
P_th_set     thermal power setpoint       input   (kW)
T_flow_set   flow temperature setpoint    input   (K)
T_flow_real  real flow temperature        output  K
m_flow       water mass flow              output  kg/s
P_th_real    real thermal power           output  kW
P_el         electrical power (active)    output  kW
Q_el         electrical power (reactive)  output  kVA
m_fuel       fuel consumption             output  kg/s
CO2          carbon dioxide emissions     output  kg/s
CO           carbon monoxide emissions    output  kg/s

The loads of the households are synthetically created in dependence on weather data supplied by [33] using the tools [34], [35]. Figures 2 to 5 show the district energy system's


[Plot: heat power in kW and state of charge over time in days (0 to 3); curves: Demand, CHP P_th, State of charge]
Fig. 2. Heat power profiles

[Plot: active power in kW and state of charge over time in days (0 to 3); curves: Demand, PV, CHP P_el, Storage, Export, State of charge]
Fig. 3. Electrical power profiles

[Plot: temperature in K and in °C over time in days (0 to 3); curves: T_amb, T_ret, T_flow_real]
Fig. 4. Mini-CHP temperature profiles

[Plot: water flow in kg/s and emissions and fuel in kg/s over time in days (0 to 3); curves: m_flow, CO2, m_fuel]
Fig. 5. Mini-CHP water flow and emission profiles

behaviour during three exemplary days. The heat profiles and the charge state of the heat storage are shown in Figure 2. The operation of the gas turbine based mini-CHP has a threshold of 60 kW (thermal). Around noon, the mini-CHP is stopped due to cheaply available electrical power from photovoltaics. In the meantime, the heating is supplied by the passive heat storage. Figure 3 shows the consumed and injected power, whereby the injected power is negative according to Table IV. Figure 4 shows the temperature behaviour at the connection points of the gas turbine as well as the ambient air temperature [33]. When the mini-CHP is not operating, the water mass flow stops. Thus, water in the pipes cools down and cold water enters the mini-CHP after a restart. Figure 5 shows the water mass flow through the mini-CHP and the accompanying natural gas fuel consumption and carbon emissions.

VI. CONCLUSIONS
The verification of multimodal energy system design is particularly suited to distributed simulation. Successful cooperation between experts from different research areas requires systematic coordination. The present contribution reduces the coordination effort by defining a clear approach. It comprises the holistic definition of the interfaces between the simulation parts from the semantic view of the investigated energy system up to the technical view. An unambiguous specification of the interfaces between the simulation parts enables independent modelling by the specialists and subsequently a frictionless fitting of the independently developed simulation models. A generic interface definition is prepared for the simulation of energy systems with district heating and power grids. The application is demonstrated as an example for a district simulation with a photovoltaic system, heat storage, electricity storage and mini-CHP.
This procedure enables a smooth model development by the individual domain experts. The source code, if confidential, does not need to be disclosed, while time-consuming adjustments of the simulation models and possible incompatibilities at the end are avoided.


For the design of energy systems, the verification of developed concepts by simulation is an established approach. Carrying out the associated investigations, including design and operational optimization, in interdisciplinary research cooperations is a future goal.

ACKNOWLEDGEMENT
This work was funded by the Initiative and Networking Fund of the Helmholtz Association in the future topic “Energy Systems Integration” under grant number ZT-0002.

REFERENCES
[1] P. Ekins, “Step changes for decarbonising the energy system: research needs for renewables, energy efficiency and nuclear power,” Energy Policy, vol. 32, no. 17, pp. 1891–1904, 2004.
[2] T. S. Schmidt, M. Schneider, and V. H. Hoffmann, “Decarbonising the power sector via technological change – differing contributions from heterogeneous firms,” Energy Policy, vol. 43, pp. 466–479, 2012.
[3] J.-F. Mercure et al., “The dynamics of technology diffusion and the impacts of climate policy instruments in the decarbonisation of the global electricity sector,” Energy Policy, vol. 73, pp. 686–700, 2014.
[4] T. Gerres, J. Ávila, P. Llamas, and T. San Román, “A review of cross-sector decarbonisation potentials in the european energy intensive industry,” Journal of cleaner production, vol. 210, pp. 585–601, 2019.
[5] E. Lachapelle, R. MacNeil, and M. Paterson, “The political economy of decarbonisation: from green energy ‘race’ to green ‘division of labour’,” New Political Economy, vol. 22, no. 3, pp. 311–327, 2017.
[6] T. Brown, D. Schlachtberger, A. Kies, S. Schramm, and M. Greiner, “Synergies of sector coupling and transmission reinforcement in a cost-optimised, highly renewable European energy system,” Energy, vol. 160, pp. 720–739, 2018.
[7] C. Müller et al., “Integrated planning and evaluation of multi-modal energy systems for decarbonization of Germany,” Energy Procedia, vol. 158, pp. 3482–3487, 2019.
[8] P. Capros et al., “European decarbonisation pathways under alternative technological and policy choices: A multi-model analysis,” Energy Strategy Reviews, vol. 2, no. 3-4, pp. 231–245, 2014.
[9] J. Després, N. Hadjsaid, P. Criqui, and I. Noirot, “Modelling the impacts of variable renewable sources on the power sector: Reconsidering the typology of energy modelling tools,” Energy, vol. 80, pp. 486–495, 2015.
[10] M. Zimmerlin, F. Mueller, M. Wilferth, L. Held, M. R. Suriyah, and T. Leibfried, “Mixed integer nonlinear optimization of coupled power and gas distribution network operation,” in 2018 53rd International Universities Power Engineering Conference (UPEC), 2018, pp. 257–262.
[11] L. Andresen, P. Dubucq, R. Peniche Garcia, G. Ackermann, A. Kather, and G. Schmitz, “Status of the TransiEnt library: Transient simulation of coupled energy networks with high share of renewable energy,” in Proceedings of the 11th International Modelica Conference, Versailles, France, September 21-23, 2015, no. 118, 2015, pp. 695–705.
[12] S. Clegg and P. Mancarella, “Storing renewables in the gas network: Modelling of power-to-gas seasonal storage flexibility in low-carbon power systems,” IET Generation, Transmission & Distribution, vol. 10, pp. 566–575, Feb. 2016.
[13] M. Geimer, T. Krüger, and P. Linsel, “Co-Simulation, gekoppelte Simulation oder Simulatorkopplung?” O+P Ölhydraulik und Pneumatik, no. 11-12, pp. 572–576, 2006.
[14] F. Schloegl, S. Rohjans, S. Lehnhoff, J. Velasquez, C. Steinbrink, and P. Palensky, “Towards a classification scheme for co-simulation approaches in energy systems,” in 2015 International Symposium on Smart Electric Distribution Systems and Technologies (EDST), 2015, pp. 516–521.
[15] C. Steinbrink et al., “Simulation-based validation of smart grids – status quo and future research trends,” in International Conference on Industrial Applications of Holonic and Multi-Agent Systems, 2017, pp. 171–185.
[16] S. Raths et al., “The energy system development plan (ESDP),” in International ETG Congress 2015; Die Energiewende – Blueprints for the new energy age, 2015, pp. 267–274.
[17] C. Molitor, S. Groß, J. Zeitz, and A. Monti, “MESCOS – a multienergy system cosimulator for city district energy systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 4, pp. 2247–2256, Nov. 2014.
[18] S. R. Drauz, C. Spalthoff, M. Würtenberg, T. M. Kneikse, and M. Braun, “A modular approach for co-simulations of integrated multi-energy systems: Coupling multi-energy grids in existing environments of grid planning & operation tools,” in 2018 Workshop on Modeling and Simulation of Cyber-Physical Energy Systems (MSCPES), 2018, pp. 12–17.
[19] F. Marten, A.-L. Mand, A. Bernard, B. K. Mielsch, and M. Vogt, “Result processing approaches for large smart grid co-simulations,” Computer Science - Research and Development, vol. 33, no. 1, pp. 199–205, Feb. 2018.
[20] J. Ruf et al., “Simulation framework for multi-carrier energy systems with power-to-gas and combined heat and power,” in 2018 53rd International Universities Power Engineering Conference (UPEC), 2018, pp. 526–531.
[21] R. Egert, A. Tundis, and M. Mühlhäuser, “On the simulation of smart grid environments,” in Proceedings of the 2019 Summer Simulation Conference, 2019.
[22] T. Blochwitz et al., “The functional mockup interface for tool independent exchange of simulation models,” in Proceedings 8th Modelica Conference, Dresden, Germany, March 20-22, 2011, 2011.
[23] T. Blochwitz et al., “Functional mockup interface 2.0: The standard for tool independent exchange of simulation models,” in Proceedings of the 9th International Modelica Conference, Munich, Germany, September 3-5, 2012, 2012.
[24] M. Krammer et al., “The distributed co-simulation protocol for the integration of real-time systems and simulation environments,” in Proceedings of the 50th Computer Simulation Conference, 2018.
[25] A. Erdmann, H. K. Çakmak, U. Kühnapfel, and V. Hagenmeyer, “A new communication concept for efficient configuration of energy systems integration co-simulation,” in 2019 IEEE/ACM 23rd International Symposium on Distributed Simulation and Real Time Applications (DS-RT), 2019, pp. 235–242.
[26] S. Rohjans, E. Widl, W. Müller, S. Schütte, and S. Lehnhoff, “Gekoppelte Simulation komplexer Energiesysteme mittels MOSAIK und FMI,” at – Automatisierungstechnik, vol. 62, no. 5, pp. 325–336, 2014.
[27] P. Palensky, A. A. Van Der Meer, C. D. López, A. Joseph, and K. Pan, “Cosimulation of intelligent power systems: Fundamentals, software architecture, numerics, and coupling,” IEEE Industrial Electronics Magazine, vol. 11, no. 1, pp. 34–50, Mar. 2017.
[28] C. Michel and P. Siron, “Delay-based distribution and optimization of a simulation model,” in 2018 IEEE/ACM 22nd International Symposium on Distributed Simulation and Real Time Applications (DS-RT), 2018, pp. 21–28.
[29] D. Müller, M. Lauster, A. Constantin, M. Fuchs, and P. Remmen, “AixLib – an open-source Modelica library within the IEA-EBC Annex 60 framework,” in Proceedings of the CESBP Central European Symposium on Building Physics and BauSIM 2016, 2016, pp. 3–9.
[30] B. van der Heijde et al., “Dynamic equation-based thermo-hydraulic pipe model for district heating and cooling systems,” Energy Conversion and Management, vol. 151, pp. 158–169, 2017.
[31] T. Krummrein, M. Henke, and P. Kutne, “A highly flexible approach on the steady-state analysis of innovative micro gas turbine cycles,” Journal of Engineering for Gas Turbines and Power, vol. 140, no. 12, Dec. 2018.
[32] R. D. Zimmerman, C. E. Murillo-Sánchez, and R. J. Thomas, “MATPOWER: Steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp. 12–19, Feb. 2011.
[33] DWD Climate Data Center (CDC), “Historical 10-minute station observations of pressure, air temperature (at 5cm and 2m height), humidity, dew point, solar incoming radiation, longwave downward radiation, sunshine duration, mean wind speed and wind direction for Germany, version V1,” last accessed: May 26th, 2020.
[34] P. Remmen, M. Lauster, M. Mans, M. Fuchs, T. Osterhage, and D. Müller, “TEASER: an open tool for urban energy modelling of building stocks,” Journal of Building Performance Simulation, vol. 11, no. 1, pp. 84–98, 2018.
[35] N. Pflugradt and U. Muntwyler, “Synthesizing residential load profiles using behavior simulation,” Energy Procedia, vol. 122, pp. 655–660, 2017, CISBAT 2017 International Conference – Future Buildings & Districts – Energy Efficiency from Nano to Urban Scale.

32
2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Bio-Inspired Drones Recruiting Strategy for Precision Agriculture Domain

Mauro Tropea, Abdon Serianni
DIMES department, University of Calabria
87036 Rende (CS), Italy
mtropea, a.serianni@dimes.unical.it

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

Abstract—An Unmanned Aerial Vehicle (UAV) is a flying device that carries no pilot on board. In many situations and applications it can relieve the human operator, because it can be deployed rapidly and act quickly. To carry out its tasks, every UAV/drone is equipped with a set of on-board sensors specific to each task. One application field is precision agriculture, where on-board cameras allow the UAV to analyse the health status of the plants in detail and to intervene promptly when needed. In this paper, coordination protocols applied to the problem of controlling a swarm of UAVs/drones against parasite attacks on crops are analysed, and different approaches are studied in order to measure their performance and costs. One problem with these devices is their limited supply of fuel and pesticide. A possible remedy is to ask other UAVs/drones for help in order to destroy the parasites completely. The idea is to apply bio-inspired concepts to the recruitment protocols and to provide a performance evaluation that shows the merit of the proposal.

Index Terms—UAV, Drone, Precision Agriculture, Coordination Protocol, Bio-Inspired

I. INTRODUCTION

Precision agriculture was born in the United States of America in the early nineties, and its name derives from the English terms Precision Agriculture, Precision Farming, or Site Specific Farming Management. Its birth and evolution have been favored and supported by the potential deriving from the widespread application of new technical solutions to the primary sector. The practice consists in applying technologies, principles and strategies for the spatial and temporal management of the variability associated with aspects of agricultural production, in relation to the real needs of the plot [1].

The application of this innovative approach requires in-depth knowledge of the physical, chemical and biological characteristics of the fields, together with their mapping and storage, so that they can then be managed by computer control of the crop operations, placed on board the machines [2]. The environmental benefits derive from a more targeted use of chemicals, better efficiency or, in the case of pesticides, a reduced development of resistance to the various active ingredients. All this affects water quality and water consumption, the quality of soil and air, climate mitigation and the energy issue.

As far as precision agriculture is concerned, these new flying devices make it possible to follow plant growth and to intervene in case of parasite infection. They can also fly at a specific height in order to acquire images without interfering with satellites [3], [4].

The possibility of creating teams of UAVs/drones is a big advantage in this application domain, because UAVs can collaborate to accomplish the assigned task even though each of them carries a limited amount of battery energy and a limited amount of, for example, pesticide when farmers have to fight against parasites. Cooperation between these devices is therefore an important aspect, and it is important to study the coordination techniques that create groups of collaborating UAVs, paying attention to energy saving [5] and energy harvesting [7], [8]. Other important topics concerning UAVs/drones are the object of research in the scientific community. A widely studied topic is the possibility of providing coverage in particular scenarios (such as emergency situations) [6] and, in turn, bandwidth management based on mobility prediction of the users and appropriate call admission [9], [10], while limiting packet loss [11], [12] in the network. Devices of this type can be used in cooperation with satellite platforms [13], [14] to provide a more ubiquitous connection, or with VANETs to support networks of vehicles [15]. In addition, these new networks can use different routing techniques based on new approaches such as opportunistic mechanisms [16], [17]. In this paper, two recruiting protocols are compared: a classical flooding mechanism and a bio-inspired approach. After a brief explanation of how the protocols work, the results obtained by comparing the two approaches in the Precision Agriculture domain are presented.

The paper is organized as follows: Section II presents related work; Section III gives a high-level overview of protocols for coordination; Section IV details two different coordination strategies and compares them; finally, Section V presents the conclusions.

II. RELATED WORK

In the following, some works that deal with coordination issues in UAV platforms are reviewed.

In [18], the authors deal with coordinating the movement of swarms of UAVs/drones using mobile networks. In [19], the authors propose biologically-inspired mechanisms to coordinate


UAVs performing target search with imperfect sensors. In [20], the proposal is based on a stigmergic approach that involves depositing digital pheromone at the locations where UAVs sense a potential target. In [21], the authors propose a bio-inspired approach for building a robust control and coordination strategy for a group of UAVs. The paper [22] proposes a solution to a non-linear constrained optimization problem for UAV motion coordination, in which a reference UAV can be seen as the leader of the group.

III. ROUTING PROTOCOLS FOR COORDINATION TECHNIQUES

Coordination techniques make it possible to organize UAVs in groups, forming a so-called Flying Ad-hoc NETwork (FANET) [23], so that they can perform a specific task together by exchanging messages with each other. In the following, a classification of the routing protocols is given in order to show the different approaches that can be followed.

A. Routing Protocols Classification

Routing protocols can be classified in the following way:

Proactive: Devices exchange packets on a periodic basis, updating the routing table of each node. This keeps the routing information always up to date, at the cost of a large number of messages in the network. Proactive protocols have the following characteristics:
• Exchange of packets at fixed intervals: Proactive protocols give the components of the network the most up-to-date routing information available. This is possible because the devices exchange information packets with each other without any request being necessary. The packets reveal both the routing information and the topological changes in the network.
• Use of tables: The use of one or more tables that store all information regarding the network topology is a distinctive characteristic of the Proactive class.
• Updating of tables: Updating the routing tables is a fundamental quality of this class of protocols, which means that users who choose these protocols can have a large network available.

Reactive: The main feature of reactive protocols is that paths among nodes are built on demand, an important property for very dynamic networks such as FANETs. Reactive protocols send a packet from a source to a destination in the following way:
• Route discovery: Used for discovering several paths between source and destination node;
• Route maintenance;
• Route deletion.

The behavior of a generic reactive protocol is shown in Fig. 1 through the time epochs T1, T2, and T3: a Route Request (RReq) message is sent by the source node (S in the figure) to its neighbors to discover the best path towards the destination. This message is forwarded by the other nodes if they lack the information needed to reach the destination (C in the figure). A Route Reply (RRep) message is sent back towards the source when the request arrives at a node that knows the destination.

Fig. 1. Messages exchange in a reactive protocol

Hybrid: A protocol family that combines the positive features of the proactive and reactive classes is called "Hybrid". Examples are Location-Aided Routing (LAR) and Zone Routing Protocol (ZRP) [24].

Hierarchical: This family reduces the overhead of proactive protocols by grouping nodes into classes inserted in the tables. In particular, the network is divided into clusters, and in each cluster a leader is elected, which yields a centralized structure for implementing more scalable protocols. Within a cluster, proactive techniques are used, while reactive techniques are used for inter-cluster communication in order to reach the destination cluster.

Location based: Based on GPS information, another protocol family performs routing according to the position of the searched node. Routing is optimized so that it takes place in a certain area (Routing Zone) that contains the Expected Zone, which is the area in which the destination node can reasonably be expected to be found.

Power-aware: In the last protocol family, the nodes are aware of the limitation of their energy resources, and this information is used to decide when to switch on or off, or to choose the least expensive path from a power point of view. An example is PARO (Power-Aware Routing Optimized) [25].

IV. COORDINATION STRATEGIES

UAV recruitment is used in many situations, namely whenever a UAV needs help from other UAVs. The mechanism can be implemented with different approaches. In this section, two techniques are explained: flooding-based and bio-inspired recruitment.
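The reactive route discovery described in Section III-A can be illustrated with a minimal Python sketch. It assumes a static neighbor table and models the RReq flood as a breadth-first search; the function name `discover_route`, the `neighbors` dictionary and the `knows_destination` predicate are hypothetical names chosen for illustration and are not taken from any protocol implementation. Route maintenance and route deletion are not modelled.

```python
from collections import deque


def discover_route(neighbors, source, knows_destination):
    """Flood an RReq hop by hop, starting at `source`.

    neighbors: dict mapping a node to an iterable of its one-hop neighbors
               (assumed static for this sketch).
    knows_destination: predicate telling whether a node can answer the RReq.

    Returns the node sequence from the source to the first node that can
    answer; the RRep would travel back along the reverse of this path.
    Returns None when no reachable node can answer.
    """
    parent = {source: None}          # reverse-path pointers set by the RReq
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if knows_destination(node):
            # Walk the reverse path, as the RRep does, then flip it.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in neighbors.get(node, ()):
            if nxt not in parent:    # each node forwards the RReq only once
                parent[nxt] = node
                queue.append(nxt)
    return None


if __name__ == "__main__":
    topology = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S"], "C": ["A"]}
    print(discover_route(topology, "S", lambda n: n == "C"))
```

Forwarding each request only once per node is a simplification that keeps the flood finite; real reactive protocols achieve the same with request identifiers.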


A. Flooding-based recruitment

When a UAV needs help, it sends a broadcast message, starts a timeout and waits for the UAVs within Wi-Fi range to receive the message and, if available, to send a reply. When the timeout expires, the requesting UAV examines the replies and chooses the UAV with the maximum pesticide level or, for equal pesticide, the one with the maximum energy level or, for equal pesticide and energy, the nearest UAV. If pesticide, energy and distance are all equal, the first UAV that answered is chosen. The UAV that needs help then sends a recruiting message to the chosen UAV and returns to the recharging base. If the UAV receives no response to its help request, it stores its coordinates in order to come back after recharging.

B. Bio-inspired recruitment

The bio-inspired technique performs recruitment with an Ant Colony Optimization (ACO) approach: each node periodically sends Forward ANT (FANT) messages. The probability of sending the FANT message from node i to node j is given by:

$$ p_{i,j} = \frac{\pi_{i,j}^{\alpha} \cdot \eta_{i,j}^{\beta}}{\sum_{k \in K} \pi_{i,k}^{\alpha} \cdot \eta_{i,k}^{\beta}} \tag{1} $$

where $\pi_{i,j}$ is the pheromone of the routing table entry whose destination is the destination to be reached and whose next hop is node j, $\eta_{i,j}$ is the local heuristic on the connection between node i and node j, represented by a random number, $\alpha$ is the incidence of the pheromone on the choice, $\beta$ is the incidence of the heuristic on the choice, and K is the set of nodes at a distance of 1 hop from i that are able to reach the destination. Once the FANT reaches the destination node, the latter sends a new message called Backward ANT (BANT) along the reverse path.

The pheromone for a given destination is reinforced when the BANT packet passes through a node. In particular, the pheromone is strengthened in the routing table entry whose destination is the node that sent the packet and whose next hop is the node from which the packet is received.

To handle paths that are no longer crossed by packets, the pheromone evaporates periodically as follows:

$$ \pi_{i,j} = (1 - \rho) \cdot \pi_{i,j-1} \tag{2} $$

where $\pi_{i,j-1}$ is the pheromone present before evaporation in the routing table of node i towards a known destination passing through the next hop j (the subscript −1 marks the previous value, not a different next hop), and $\rho$ is the evaporation coefficient of the pheromone, a value between 0 and 1.

C. Recruiting Protocol Comparison

Previous work [27] presented a comparative analysis of reactive flooding versus a link state approach in a precision agriculture domain. This contribution instead compares a classical flooding mechanism with a recruiting protocol based on a bio-inspired approach, considering consumed energy, killed parasites and protocol overhead as metrics. The comparative analysis was carried out with a UAV simulator designed for evaluating UAV performance in the Precision Agriculture domain [26].

Fig. 2 shows the number of killed parasites. The bio-inspired approach finds a higher number of parasites than the reactive flooding mechanism. This is due to its ability to perform a better UAV recruitment thanks to the FANT and BANT messages, whereas Reactive Flooding floods messages over the whole network.

Fig. 3 compares the two approaches in terms of the number of recruiting requests. The bio-inspired approach sends a higher number of UAV recruiting requests. This means a higher number of recruited UAVs and, consequently, a greater number of killed parasites in the considered area.

Fig. 4 shows the trend of the consumed energy for both approaches. As the figure shows, the bio-inspired technique consumes more energy than the reactive flooding approach. Reactive flooding sends less data than the bio-inspired approach, so its energy consumption is lower, but it results in a lower number of killed parasites.

Fig. 2. Number of killed parasites comparison

V. CONCLUSION

This paper presents a comparative analysis of two different recruiting approaches for coordinating UAVs in a Precision Agriculture domain in the fight against parasites. A simulator specifically designed for this application context has been used. The simulation results showed that the bio-inspired approach performs better than the reactive flooding one, since it kills a greater number of parasites and makes better use of the recruitment of other UAVs, although it has one drawback: a greater energy consumption.
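The two recruitment rules of Section IV can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: the function names, the dictionary-based reply and routing-table layouts, and the uniform fallback for an all-zero weight sum are hypothetical, and the heuristic is assumed to enter the probability with exponent β, as in the usual ACO transition rule. It is not the simulator used for the results above.

```python
def choose_recruit(replies):
    """Flooding-based recruitment: pick a helper once the timeout expires.

    replies: list of dicts with keys 'id', 'pesticide', 'energy', 'distance',
             assumed to be stored in order of arrival.
    Preference: most pesticide, then most energy, then smallest distance,
    then earliest reply. Returns None when nobody answered.
    """
    if not replies:
        return None
    index, best = min(
        enumerate(replies),
        key=lambda ir: (-ir[1]["pesticide"], -ir[1]["energy"],
                        ir[1]["distance"], ir[0]),
    )
    return best["id"]


def next_hop_probabilities(pheromone, heuristic, alpha, beta):
    """Probability of forwarding a FANT to each candidate next hop.

    pheromone, heuristic: dicts keyed by next-hop id for one destination.
    """
    weights = {j: (pheromone[j] ** alpha) * (heuristic[j] ** beta)
               for j in pheromone}
    total = sum(weights.values())
    if total == 0:
        # Assumption of this sketch: fall back to a uniform choice.
        return {j: 1.0 / len(weights) for j in weights}
    return {j: w / total for j, w in weights.items()}


def evaporate(pheromone, rho):
    """Periodic evaporation: every entry is scaled by (1 - rho)."""
    return {j: (1.0 - rho) * p for j, p in pheromone.items()}


if __name__ == "__main__":
    replies = [
        {"id": "u1", "pesticide": 5, "energy": 7, "distance": 30.0},
        {"id": "u2", "pesticide": 5, "energy": 9, "distance": 50.0},
        {"id": "u3", "pesticide": 3, "energy": 10, "distance": 5.0},
    ]
    print(choose_recruit(replies))
    print(next_hop_probabilities({"a": 1.0, "b": 3.0},
                                 {"a": 1.0, "b": 1.0}, alpha=1, beta=1))
    print(evaporate({"a": 2.0}, rho=0.5))
```

Encoding the tie-breaking order as a single sort key keeps the selection rule in one place, which makes it easy to check against the textual description.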


Fig. 3. Number of recruiting requests comparison

Fig. 4. Consumed energy comparison

REFERENCES

[1] Pierce, F. J. and Nowak, P., “Aspects of precision agriculture,” Advances in agronomy, 67, pp. 1–85, Elsevier (1999).
[2] Faiçal, B. S., Costa, F. G., Pessin, G., Ueyama, J., Freitas, H., Colombo, A., ... and Braun, T., “The use of unmanned aerial vehicles and wireless sensor networks for spraying pesticides,” Journal of Systems Architecture, 60(4), pp. 393–404, (2014).
[3] Primicerio, J., Di Gennaro, S. F., Fiorillo, E., Genesio, L., Lugato, E., Matese, A., and Vaccari, F. P., “A flexible unmanned aerial vehicle for precision agriculture,” Precision Agriculture 13(4), pp. 517–523, (2012).
[4] Pederi, Y. and Cheporniuk, H., “Unmanned aerial vehicles and new technological methods of monitoring and crop protection in precision agriculture,” IEEE International Conference Actual Problems of Unmanned Aerial Vehicles Developments (APUAVD), pp. 298–301, (2015).
[5] De Rango, F. and Tropea, M., “Swarm intelligence based energy saving and load balancing in wireless adhoc networks,” Proceedings of the 2009 workshop on Bio-inspired algorithms for distributed systems, pp. 77–84, ACM (2009).
[6] De Rango, F., Tropea, M., Fazio, P., “Bio-inspired routing over FANET in emergency situations to support multimedia traffic,” Proceedings of the ACM MobiHoc workshop on innovative aerial communication solutions for FIrst REsponders network in emergency scenarios, pp. 12–17, (July 2019).
[7] Nguyen, T. N., Duy, T. T., Luu, G.-T., Tran, P. T., and Voznak, M., “Energy harvesting-based spectrum access with incremental cooperation, relay selection and hardware noises,” (2017).
[8] Nguyen, H.-S., Bui, A.-H., Do, D.-T., and Voznak, M., “Imperfect channel state information of AF and DF energy harvesting cooperative networks,” China Communications 13(10), pp. 11–19, (2016).
[9] Fazio, P., Tropea, M., Sottile, C., Marano, S., Voznak, M., and Strangis, F., “Mobility prediction in wireless cellular networks for the optimization of call admission control schemes,” IEEE 27th Canadian Conference on Electrical and Computer Engineering (CCECE), pp. 1–5, (2014).
[10] Fazio, P., Tropea, M., Veltri, F., and Marano, S., “A novel rate adaptation scheme for dynamic bandwidth management in wireless networks,” IEEE 75th Vehicular Technology Conference (VTC Spring), pp. 1–5, (May 2012).
[11] Frnda, J., Voznak, M., and Sevcik, L., “Impact of packet loss and delay variation on the quality of real-time video streaming,” Telecommunication Systems 62, pp. 265–275, (Jun 2016).
[12] Voznak, M., Kovac, A., and Halas, M., “Effective packet loss estimation on VoIP jitter buffer,” International Conference on Research in Networking, Springer, Berlin, Heidelberg, pp. 157–162, (2012).
[13] De Rango, F., Tropea, M., Santamaria, A. F., and Marano, S., “An enhanced QoS CBT multicast routing protocol based on genetic algorithm in a hybrid HAP-satellite system,” Comput. Commun. 30, pp. 3126–3143, (Nov. 2007).
[14] De Rango, F., Tropea, M., and Marano, S., “Integrated services on high altitude platform: Receiver driven smart selection of HAP-GEO satellite wireless access segment and performance evaluation,” International Journal of Wireless Information Networks 13, pp. 77–94, (Jan 2006).
[15] Fazio, P., Tropea, M., Sottile, C., and Lupia, A., “Vehicular networking and channel modeling: a new Markovian approach,” 12th Annual IEEE Consumer Communications and Networking Conference (CCNC), pp. 702–707, (Jan 2015).
[16] Socievole, A., De Rango, F., and Coscarella, C., “Routing approaches and performance evaluation in delay tolerant networks,” Wireless Telecommunications Symposium (WTS), pp. 1–6, (April 2011).
[17] Socievole, A., Yoneki, E., De Rango, F., Crowcroft, J., “Opportunistic message routing using multi-layer social networks,” Proceedings of the 2nd ACM workshop on High performance mobile opportunistic systems, pp. 39–46, (Nov 2013).
[18] de Souza, B. J. O. and Endler, M., “Coordinating movement within swarms of UAVs through mobile networks,” IEEE International Conference on Pervasive Computing and Communication Workshops (PerCom Workshops), pp. 154–159, (2015).
[19] Alfeo, A. L., Cimino, M. G., De Francesco, N., Lazzeri, A., Lega, M., and Vaglini, G., “Swarm coordination of mini-UAVs for target search using imperfect sensors,” Intelligent Decision Technologies (Preprint), pp. 1–14, (2018).
[20] Cimino, M. G., Lazzeri, A., and Vaglini, G., “Combining stigmergic and flocking behaviors to coordinate swarms of drones performing target search,” 6th International Conference on Information, Intelligence, Systems and Applications (IISA), pp. 1–6, IEEE (2015).
[21] Zelenka, J. and Kasanicky, T., “Outdoor UAV control and coordination system supported by biological inspired method,” 23rd International Conference on Robotics in Alpe-Adria-Danube Region (RAAD), pp. 1–7, IEEE (2014).
[22] Meng, W., Xie, L., and Xiao, W., “Communication aware UAV motion coordination for source localization and tracking,” Proceedings of the 32nd Chinese Control Conference, pp. 7451–7455, IEEE (2013).
[23] Bekmezci, I., Sahingoz, O. K., and Temel, S., “Flying ad-hoc networks (FANETs): A survey,” Ad Hoc Networks 11(3), pp. 1254–1270, (2013).
[24] Husain, A. and Sharma, S. C., “Comparative analysis of location and zone based routing in VANET with IEEE 802.11p in city scenario,” International Conference on Advances in Computer Engineering and Applications, pp. 294–299, (March 2015).
[25] Jung, E. S., and Vaidya, N. H., “Power aware routing using power control in ad hoc networks,” ACM SIGMOBILE Mobile Computing and Communications Review, 9(3), pp. 7–18, (2005).
[26] De Rango, F., Palmieri, N., Tropea, M., and Potrino, G., “UAVs team and its application in agriculture: A simulation environment,” SIMULTECH 2017, pp. 374–379, (2017).
[27] Tropea, M., Santamaria, A. F., De Rango, F., and Potrino, G., “Reactive flooding versus link state routing for FANET in precision agriculture,” 16th IEEE Annual Consumer Communications & Networking Conference (CCNC), pp. 1–6, (2019).


Real-Time Simulation of Robot Swarms with Restricted Communication Skills

Alexander Puzicha
Informatik 4, TU Dortmund
D-44221 Dortmund
alexander.puzicha@cs.tu-dortmund.de

Peter Buchholz
Informatik 4, TU Dortmund
D-44221 Dortmund
peter.buchholz@cs.tu-dortmund.de

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

Abstract—The paper presents a new approach and a related software environment for the parallel simulation of swarms of autonomous robots in real time. The software environment has been developed for model based analysis of algorithms to control large swarms of distributed autonomous mobile robots communicating over an unreliable and capacity restricted wireless network. It includes a physical simulation of static obstacles, dynamic obstacles with scriptable movement, soil condition, active jammers, static and dynamic link obstacles with configurable damping as well as noise floors. The simulated ground based mobile robots use control particle belief propagation (C-PBP) as a randomized and sample based model predictive closed loop controller in combination with cost functions to evaluate the situations. We emphasize where the use of shared memory parallelism is beneficial and which inaccuracies in computations are acceptable to increase performance without losing realism.

Index Terms—Real-Time Simulations, Parallel Simulations, Simulation-Based Virtual Environments, Swarm Robotics

I. INTRODUCTION

We present a novel control algorithm technique for robot swarms to establish mobile ad hoc networks in disaster areas and validate it with a real-time simulation based on radio signal propagation physics. In application areas like industrial production, disaster management or exploration of unknown terrains, autonomous robots are increasingly important [1]. Usually, swarms of robots jointly solve a problem or perform some tasks. Especially in disaster situations, e.g. after an earthquake or a serious industrial accident, tasks that are dangerous for humans, like terrain exploration, rescue operations or setting up a communication network, are extremely important and have to be done as soon as possible. Robot swarms are in principle well suited to perform these operations. Ideally, a human operator defines the general goal of the mission and the robots perform their individual tasks on their own, which implies that they have to synchronize during operation and have to make decisions autonomously, often with limited knowledge about their environment and the state of other robots in the swarm.

The outlined scenario is challenging because the control software of the robots has to be tested thoroughly, which usually cannot be done in the real environment after the disaster. Thus, virtual environments have to be built and the control software has to be deployed in these environments. This means a real-time simulation environment is necessary to provide realistic test conditions.

Contribution of the Paper: The simulation tool Gazebo [2] of the Robot Operating System (ROS) framework [3] is broadly used. It is used for Defense Advanced Research Projects Agency (DARPA) [4] and National Aeronautics and Space Administration (NASA) challenges in robotics, but currently lacks the capability to simulate larger swarms, physical transmission influences and huge areas of up to 25 km² in real time. It is more focused on a broad range of different robotics applications and supports many different systems and sensors. In contrast, the tool presented in this paper is intended as an environment for the development and analysis of control algorithms for swarms and makes use of parallelization techniques for selected aspects of physics and control technology. On the other hand, we focus on realistic assumptions with dynamics in a continuous world, which differs significantly from previous research on swarm intelligence in grid based worlds (see [5] and [6]). Currently three basic missions, which can be combined in any way, are implemented. The first two missions are the exploration of dynamic unknown areas and the surveillance as well as escort of vehicles at a given distance. The last mission is establishing a Mobile Ad Hoc Network (MANET) to support rescue teams.

Structure of the Paper: In the second Section we present the basic concept of control theory which the autonomous agents use to find their own goal with respect to the behavior of the agents in their vicinity. Afterwards, the main parts of the simulation and its mathematical background are outlined. Section IV explains how the previous simulation parts act together and synchronize as efficiently as possible, while still keeping a realistic and correct view of the entire system. Then, detailed examples and their analysis follow, which lead to the conclusions in Section VI.

II. MODEL PREDICTIVE CONTROL OF ROBOTS

Each robot agent is controlled by its own model predictive controller (MPC) [7], which creates trajectories based on information gathered from the sensors of the robot and acquired via communication with other agents. The MPC consists of three main parts (see Fig. 1). First, a model of the controllable part of the system is needed, which is the robot itself. It is used to estimate and predict system states. These states are rated through absolute cost functions aggregated to a cost value that represents the usefulness of



the system state in space and time. The cost functions and lsocial (|d|)
constraints depend on the mission targets, configurations made
by an operator and on the states of the other robots. The
third part is the optimization, which minimizes the cost value 20
by generating new trajectories with the help of the model.
After the solution has been found, the corresponding control
vector is set as an input of the real system. The evaluation
of the dynamic environment inside the cost function and its 10
influence on the robots is classified into three basic categories.

costs
The first category contains all static obstacles such as walls,
barriers and obstructions. Those objects can be detected by
the robots or can be preloaded based on detailed maps. The 0
second group consists of dynamic obstacles that can move
around. They are separated into predictable obstacles, like
other robots or vehicles inside a communication group, and
−10
obstacles with unknown behavior. The last category contains
tiles of explored area that describe the accessibility of the −30 −20 −10 0 10 20 30
terrain. As an optimizer, the control particle belief propagation
(C-PBP) [8] algorithm is used. It combines parallel locally d
refined guided random walkers using discrete sampling of cost Fig. 2. Robot social function with desired minimum costs Lmin = −10,
functions and knowledge transfer between optimization steps. maximum costs Lmax = 20 and a desired distance ddesired = 10 for the
minimum (see [11])
actuator
mission & communication
disturbance
MPC environment: static obstacles, dynamic obstacles and terrain
change of accessibility. Most of the objects in the first category can
cost function reward action environemt
and constraints
optimization system be represented by locally limited distance based functions to
plan prevent collisions:
model sensor p
trajectory measurement d = kpk − xk k2D = (∆x)2 + (∆y)2 (1)
 0  
xk xk
Fig. 1. Four components of the robot: communication module, model
predictive controller (MPC), actuator and sensor with pk =  yk0  , xk =  yk 
θk0 θk
III. A V IRTUAL E NVIRONMENT FOR ROBOT S WARMS
The model predictive control core as well as the object handling structures and communication interfaces are written as an independent kernel. Hence, they can be used in simulations and can be deployed on real robots. However, this paper focuses on the simulation and evaluation of virtual environments and robot swarms to test swarm control algorithms. Therefore, we describe the four main parts of the simulation.

A. Simulation Of The Environment
The environment is represented by a set of cost functions. In a completely unknown environment there exists in principle a global minimum as the desired working state and often an optimal path towards it, but both are unknown. Thus, a rating relative to the optimum is impossible. Consequently, the environment is modeled with absolute costs referring to time and space. Due to the use of C-PBP, there are almost no restrictions concerning these functions. They just have to be defined on the entire area, but in the worst case they may depend on every parameter and on randomness, which can cause a high complexity. Additionally, it has to be taken into account that the actual cost function and the representation that the robot has usually differ. The robots have three categories to evaluate the

Let $d \in \mathbb{R}_0^+$ denote the distance between an object position $p_k \in \mathbb{R}^3$ and a system state $x_k \in \mathbb{R}^3$ for a discrete time step $k$. Each position and system state consists of a two-dimensional position $(x_k, y_k)$ and an orientation $\theta_k$. These functions can represent walls, trees, lakes, rivers and social behavior (see [9] and Fig. 2). More complex objects like surveillance towers with a bounded angle of view need the whole system state because the view direction and the field of view have to be calculated from the orientation and the range of view. Then, the robot maps a special cost function to this area to avoid entering it [10].

Vehicles and time-dependent structures belong to the second category. Nevertheless, independently of their behavior, robots only distinguish cooperative objects, which transmit their planned movement or behavior, for example as a trajectory or as a velocity vector, from non-cooperative objects, which are treated as unknown static objects. As a possible improvement, movement estimation can be implemented for those objects. We have to distinguish between the view of the simulation, which captures the complete system state, and the information that can be gathered from the sensors of a robot, which is much more limited.
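The field-of-view test for a surveillance tower described above can be illustrated with a minimal sketch. It assumes the orientation convention of Equation 3, where $\theta = 0$ points along the y axis. The function names `in_field_of_view` and `tower_cost`, the opening-angle parameter and the constant penalty are hypothetical and are not taken from the simulator.

```python
import math


def in_field_of_view(tower_state, point, view_range, view_angle):
    """Return True if `point` lies inside the tower's bounded view cone.

    tower_state: (x, y, theta), theta in radians, theta = 0 looks along +y
                 (assumed to match the convention of Equation 3).
    point:       (x, y) position to test.
    view_range:  maximum viewing distance (hypothetical parameter).
    view_angle:  full opening angle of the cone in radians (hypothetical).
    """
    tx, ty, theta = tower_state
    dx, dy = point[0] - tx, point[1] - ty
    if math.hypot(dx, dy) > view_range:
        return False
    # Bearing measured from the +y axis, consistent with x += v*sin(theta).
    bearing = math.atan2(dx, dy)
    # Wrap the angular difference into [-pi, pi).
    diff = (bearing - theta + math.pi) % (2.0 * math.pi) - math.pi
    return abs(diff) <= view_angle / 2.0


def tower_cost(tower_state, point, view_range, view_angle, penalty=100.0):
    """Illustrative cost: a constant penalty inside the observed area."""
    inside = in_field_of_view(tower_state, point, view_range, view_angle)
    return penalty if inside else 0.0
```

A planner could add `tower_cost` to the other cost terms for every sampled state. The constant penalty is only a placeholder for the special cost function described in [10].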

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Equation 2 presents the cost function for the last category. It depends on the control vector $u_k$ and on the terrain accessibility $G(x_{k-1})$, which is based on the system state $x_{k-1}$.

$$l_{movement}(x_{k-1}, u_k) = v_k^2 \cdot G(x_{k-1}) \cdot \rho_T + \dot{\theta}_k^2 \cdot G(x_{k-1}) \cdot \rho_R, \quad \text{with } u_k = \begin{bmatrix} v_k \\ \dot{\theta}_k \end{bmatrix} \tag{2}$$

Let $v$ and $\dot{\theta}$ denote the translation and rotation velocity of the robot, respectively, and let $\rho_T$ and $\rho_R$ be parameters to adjust the influence for different types of robots. Hence, the equation describes how much effort or energy is needed to move on the current terrain. For example, movement on streets requires less energy than movement on mud or rocks. This has an influence on the battery capacity of the robot and therefore also on the decision how to continue the mission. This information is based on a tile map of the terrain and affects only the movement.

B. Simulation of Robot Dynamics
A nonholonomic system model is used as the dynamic model for the state prediction inside the MPC. This model covers tracked vehicles as well as walking mobile robots.

$$x_k = f_k(x_{k-1}, u_k) = \underbrace{\begin{pmatrix} x_k \\ y_k \\ \theta_k \end{pmatrix}}_{x_k} = \underbrace{\begin{pmatrix} x_{k-1} \\ y_{k-1} \\ \theta_{k-1} \end{pmatrix}}_{x_{k-1}} + \begin{pmatrix} v_k \cdot \sin(\theta_{k-1}) \\ v_k \cdot \cos(\theta_{k-1}) \\ \dot{\theta}_k \end{pmatrix} \cdot \Delta k, \quad \text{with } u_k = \begin{bmatrix} v_k \\ \dot{\theta}_k \end{bmatrix} \tag{3}$$

Let $f_k$ denote in general a discrete nonlinear map that maps the current system state $x_{k-1}$ and control vector $u_k$ to a successor state $x_k$. The system state $x_k$ consists of a two-dimensional position $(x_k, y_k)$ and an orientation $\theta_k$. The time delta between the current state and the successor is denoted by $\Delta k$. Each robot is equipped with a light detection and ranging sensor (LIDAR) to detect obstacles in the vicinity.

C. Simulation Of Communication
A necessary condition for autonomous agents to form a swarm is the information transfer via communication. The robots are equipped with configurable wireless network interfaces to achieve emergence based on distributed and shared knowledge. Therefore, we created a network simulation model based on physical wave propagation, including signal power dissemination depending on specified frequencies, free-space path loss and obstacles, but it does not cover reflections yet. Moreover, noise levels of the environment and active signal jammers can be placed. This realistic network simulation serves as an analysis tool and supports developing strategic behavior for signal changes and losses of groups of swarm agents. Furthermore, one of the basic missions is to create a MANET in disaster areas. Thus, it is mandatory to predict the connection status and quality. Each message gets a unique identification number to enable a message filter, and lightly modified Lamport clocks [12] are used to create a logical as well as temporal order on the messages.

Signals are transmitted via Orthogonal Frequency-Division Multiplexing (OFDM), as in Long-Term Evolution (LTE), and are divided into subcarriers. Hence, a channel width $\Delta f_c$, a carrier distance $\Delta f_t$ and protection distances can be specified to calculate the subcarriers. The carrier distance determines the symbol time [13]:

$$t_s = \frac{1}{\Delta f_t} \tag{4}$$

Each subcarrier is modulated with one symbol per symbol time. The symbol width $w_{symbol}$ depends on the modulation method. The simulation environment supports Quadrature Phase-Shift Keying (QPSK), 16 Quadrature Amplitude Modulation (QAM), 64 QAM and 256 QAM. To increase bandwidth, Multiple-Input Multiple-Output (MIMO) [14] methods are available. Based on this maximum, the Received Signal Strength Indicator (RSSI) and the Signal-to-Noise Ratio (SNR) are used to estimate the real transmission rate.

$$RSSI = \underbrace{P_t + G_t + G_r}_{\text{transmission power}} + \underbrace{20 \cdot \log_{10}\left(\frac{c}{f \cdot 4\pi \cdot d}\right)}_{\text{free space damping}} - \underbrace{\sum_i P^i_{obstacle}}_{\text{obstacle damping in line of sight}} \tag{5}$$

The extended Friis Equation 5 [15] calculates the RSSI of the receiver in decibels relative to one milliwatt (dBm). Let $P_t$ denote the sending power, and let $G_t$ and $G_r$ denote the antenna gain of the sender and the receiver, respectively. The transmission power is reduced by free space damping, which depends on the speed of light $c$, the sending frequency $f$ and the distance $d$ between the stations. For calculating the SNR, a noise floor power $P_{noise}$ has to be specified, measured or calculated from Johnson–Nyquist noise (see [16]).

$$SNR = RSSI - P_{noise} \tag{6}$$

For example, with a channel width $\Delta f_c = 10$ MHz, a carrier distance $\Delta f_t = 15$ kHz, a 256 QAM modulation and 2x2 MIMO, a maximum transmission rate of 144 Mbit/s can be achieved. However, with SNR = 20 dB and RSSI = −69 dBm, only 144 Mbit/s · 2/3 = 96 Mbit/s remain [10].

D. Operator
The simulation allows the operator to configure all parameters of the entire simulation, including the robot physics, its planning technique, and the bandwidth, antenna gain and frequency of the communication module. Moreover, the behavior of objects can be scripted. The description is done in YAML files, so no programming knowledge is needed. During runtime, commands can be sent by convoy vehicles, which are steered by the operator, to change the missions of the swarm at any time.
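The link-budget formulas of Section III-C (Equations 4 to 6) can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the guard band ("protection distance") of 1 MHz is an assumed value that the paper does not state, and the rate formula ignores cyclic prefix and control overhead. With these assumptions the sketch reproduces the 144 Mbit/s of the example above.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def rssi_dbm(p_t_dbm, g_t_db, g_r_db, freq_hz, dist_m, obstacle_losses_db=()):
    """Extended Friis equation (Equation 5): received power in dBm."""
    free_space_db = 20.0 * math.log10(C / (freq_hz * 4.0 * math.pi * dist_m))
    return p_t_dbm + g_t_db + g_r_db + free_space_db - sum(obstacle_losses_db)


def snr_db(rssi, p_noise_dbm):
    """Equation 6: received power minus noise floor (a ratio, hence dB)."""
    return rssi - p_noise_dbm


def max_rate_bit_per_s(channel_hz, carrier_spacing_hz, guard_hz,
                       bits_per_symbol, mimo_streams):
    """Upper bound of the transmission rate for an OFDM channel.

    guard_hz is an ASSUMED total protection distance; all frequencies are
    integers in Hz so that the subcarrier count is exact.
    """
    n_subcarriers = (channel_hz - guard_hz) // carrier_spacing_hz
    # Equation 4: t_s = 1 / carrier spacing, so every subcarrier carries
    # carrier_spacing_hz symbols per second.
    symbols_per_s = carrier_spacing_hz
    return n_subcarriers * bits_per_symbol * symbols_per_s * mimo_streams


# 10 MHz channel, 15 kHz spacing, assumed 1 MHz guard band,
# 256 QAM (8 bit per symbol), 2x2 MIMO (2 streams).
rate = max_rate_bit_per_s(10_000_000, 15_000, 1_000_000, 8, 2)
print(rate / 1e6, "Mbit/s")
```

The reduction from this maximum to the real rate depends on SNR and RSSI as described in [10] and is deliberately not modeled here.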


IV. PARALLEL SIMULATION OF SWARM BEHAVIOR

Fig. 3. Example scenario to indicate message types

Figure 3 shows a basic scenario of the simulation and indicates the four available message types. In general, all information is published by broadcasting messages. Important information such as mission messages, which are specified by the convoy, is forwarded with the echo algorithm [17] to reach as many agents as possible. Furthermore, the convoy sends its movement data with another message. The robots themselves broadcast information about obstacles that enter their sensor range, and transmit the planned trajectory to all directly reachable nodes. Each trajectory covers a planning horizon of 32.5 s. When obstacles or an increasing noise floor, e.g. created by signal jammers, cause a connection loss, a special cost function [10] leads the robot to reduce the distance to the other agents. For this purpose, the last transmitted trajectory is used to predict the position and movement of those agents. But each agent is developed as an independent autonomous agent as well.

The simulation of swarms and dynamic obstacles is inherently concurrent because every robot forms an autonomous instance consisting of parallel hardware and all robots run in parallel. A robot is equipped with sensors for object detection, a communication module that can send and receive in parallel, and a single board computer to process the collected data and to run the control algorithm. Thus, four independent threads per robot are created (see Fig. 4) to represent reality. In order to analyze the real-time constraints, we measure the timings of these threads and check them for deadline misses. All threads have the same priority in the simulator. For this paper we present the data for the trajectory planning thread, which is the most computationally intensive thread and has the smallest hard deadline of 300 ms among these four threads. The deadline is a result of the maximum robot velocity and the robot size.

In addition, the simulation of the surrounding environment is split into two independent threads. One thread updates connection qualities between the robots. It should be configured with a low update rate between 5 Hz and 20 Hz, because on the one hand slightly modified positions often do not cause changes in signal quality in outdoor scenarios, and on the other hand network interface cards react relatively slowly to signal changes. The second thread calculates the rest of the physics, like positions of convoy vehicles and robots based on their control values. Since it mostly represents actuators, it needs a higher update rate of approximately 60 Hz to simulate smooth movement without glitching through obstructions. Furthermore, there exists a rendering thread with read-only access to all data structures. Hence, it has almost no influence on the simulation performance.

Fig. 4. Thread and data structure of simulation

A. Object Detection
Each robot recognizes obstacles of the environment and receives missions. This information is internally transformed into cost functions and processed by the control algorithm in combination with the dynamic model, resulting in the future trajectory of the robot. Afterwards, the generated trajectory is sent to other robots. Although the data flow seems to be simple, because the sensor just takes information from the environment, the parallelism leads to concurrent access to those environmental data. To avoid conflicts, the data structures allow just read-only access from robots, whereas the common environment update thread is allowed to modify the data. Because of the high update rate in comparison to the slower rate of the object detection and trajectory planning threads, the small resulting inaccuracy is equivalent to sensor noise. Consequently, no synchronization is needed here.

B. Data Transfer
The sending thread is triggered by flags which are set by the planning thread after a new trajectory was calculated or by the object detection after a new static object was detected. Additionally, it can be triggered to forward received packages. To provide a realistic data transfer, all data is copied by this thread and packed into a message object. Thus, read only
access to the data is granted and no synchronization is required because only the newest data has to be transmitted. Since data can only be transmitted by the sender thread, it is guaranteed that at most one upload to one station is performed, as in a real system. Based on the message size and the transmission rate, the transfer time is calculated and causes the thread to sleep for the specified time. Afterwards, the receive callback thread of the receiver is called.

The flags themselves do not have to be protected by locks either, because an inconsistent flag leads to package duplication or loss. Both consequences are typical for wireless ad hoc networks and make the simulation more realistic. Furthermore, duplicates get caught by the message filter based on identification numbers, and package losses cause other robots to predict the state of the sender based on a previous trajectory, which is in general about 300 ms old if no consecutive losses occur, but covers 30 s. If messages containing obstacle information get lost, the robot will detect this obstacle on its own if it is within sensor range.

C. Process Received Information
In contrast to the sending thread, the receiving thread has to be aware of memory consistency. Therefore, it first takes a general lock which prevents receiving multiple packages at the same time, which is not possible in a real system either. This causes the sending thread to block after calling the receiving callback. This blocking is similar to the IEEE 802.11 Carrier Sense Multiple Access (CSMA) Media Access Control (MAC) protocol for wireless networks, where the sender waits for a random back-off time if the channel is currently in use. However, the blocking time is short because of the small data packages. The largest package is the message that contains the trajectory. It comprises 1472 B, of which the header, consisting of an identification number, a timestamp, the size information and the message type, consumes 16 B. The other packages are much smaller, because the movement data of the convoy vehicle just contains the position, the velocity and its identification number, which are 28 B + 16 B = 44 B in total. The mission and obstacle messages including the header are only 37 B and 28 B in size, respectively.

As shown in Figure 4, the receiver processes the data to update the obstacle data, the missions and the robot models. A robot model is a data structure to manage the received information about other robots. It offers state and connection quality prediction based on the planned trajectory as well as information about the area explored by the robot that is linked to the model. As a result, these data structures have to be protected by locks to guarantee consistency of the data, whereas the temporal order is not that important, because all data get constantly updated.

D. Control And Planning
The computationally intensive trajectory planning thread that controls the robot behavior needs to access almost all data structures, because each entity in the environment, such as obstacles and missions, forms an independent cost function. These are used to evaluate discrete samples, which are created by the guided random walkers of C-PBP [8] (see Fig. 1). For a planning horizon of 32.5 s, which corresponds to 60 samples, and 24 parallel random walkers, 1440 function calls per entity have to be evaluated. Hence, it is necessary to plan the access so that the number of locks and synchronizations is reduced as much as possible. The position vector is read without any locks. This can lead to inaccuracies if the vector gets updated during reading. The resulting inaccuracy is significantly smaller than inaccuracies produced by simultaneous localization and mapping algorithms used on real robots; therefore it can be tolerated. Processing of the obstacle data needs no synchronization because they are only accessed by read operations. Static objects in these lists do not get deleted because they never change, whereas the influence of dynamic obstacles decreases and they are deactivated by a flag after some time in case they leave the sensor range. When the lists are too long, a synchronized purge can be applied to them, which guarantees the consistency of the lists. Nevertheless, the essential parts of the robot models and the missions have to be protected by locks. Those parts are the prediction of states and connections based on current information. The models and missions are updated via the receiving thread and the update thread. The receiver data is more important because it contains the correct information from the senders. In contrast, the update thread estimates those data based on older information. After this computation finishes, the trajectory data will be overwritten by this thread and the sending thread is activated by a flag afterwards.

It does not matter if the sending thread is scheduled too late because only the newest trajectory should be transmitted. The time stamp of each message is used to drop older trajectory information. Furthermore, if an inconsistency of the flag occurs because of parallel writing, it will cause a package loss. But for each pair of system state and control vector of the trajectory the corresponding time is transmitted as well. Thus, it is possible to predict the behavior of the agent for at least 32.5 s, which is equivalent to more than 100 consecutive package losses. The trajectory has exponentially distributed time samples. Hence, the beginning can be predicted very accurately, and the later the point in time of the trajectory, the more inaccurate the estimation is. This corresponds to the general observation that the older the message, the more inaccurate the information is.

E. Timing Analysis And Performance Improvement With OpenMP®
The planning thread runs periodically with a relative deadline of 300 ms, but its computation time obviously increases with the number of robots in the swarm. Consequently, no fixed timing analysis is possible because the execution times of the threads depend quadratically on the swarm size. However, this statement only applies to the simulation because the number of objects to be managed per robot only increases linearly in a real system, but the simulation also has to handle the
increased number of robots. Equation 7 emphasizes that the simulation must also manage more robots.

$$objects_{managed} = \sum_{robot=1}^{R} \left( \sum_{model=1}^{R} 1 + \sum_{object=1}^{E} 1 + \sum_{mission=1}^{M} 1 \right) \in O(R^2 + R \cdot E), \quad \text{with } M \ll R \le E \tag{7}$$

Let $R$ denote the number of robots, $E$ the number of objects in the environment without the robots and $M$ the number of missions, which is currently at most 3.

In general, the simulation generates many more threads than there are CPU cores available, even on machines with a large number of cores like compute servers.

$$\#t = \#_{env} + \#_{con} + \sum_{robot=1}^{R} \#_{robot} = 1 + 1 + 4 \cdot R \tag{8}$$

Equation 8 shows that the number of threads $\#t$ exceeds the number of cores of standard PCs even with 3 robots. In this equation $\#_{env}$ denotes the number of threads for the update of the environment and $\#_{con}$ the number of threads for the connection update. $\#_{robot}$ corresponds to the number of threads per robot, as presented in Figure 4. To deal with different swarm sizes and timings, which can be configured by the user, these threads are split into smaller workloads. Therefore, the scheduler can utilize each core as well as possible to prevent deadline misses. The separation into smaller workloads is done with OpenMP® and is indicated by the orange rectangles with rounded corners in Figure 4.

During development, care was taken to ensure that each data item of a data structure resides in consecutive memory blocks, but the structures themselves are in different regions of memory. Hence, processing of data can be parallelized by subjects on different cores. In contrast, the access to a single data structure should not be parallelized to avoid cache conflicts.

The control software is designed to run on single board computers on real robots, which are in general multi-core systems. The trajectory planning thread handles 24 independent random walkers for 60 time steps. So, each of the 24 samples can be processed on different cores, as well as each data structure (see Figure 4). The benefit of parallel updates and processing of data is that ideally the planning thread works with the newest information and that the information of a data structure corresponds to the same time instance. Thus, the data are not changed during planning, nor do there exist different time levels inside the structures (see Figure 5).

Fig. 5. Advantages and disadvantages of subject based parallelization (panels: subject based parallel processing and standard parallel processing, each on two cores over one time period)

The disadvantage of these fine granular threads is increasing computation time because of context switches, which are marked in black in Figure 5. Orange, yellow and green illustrate parts of the update thread for different subjects and blue indicates the trajectory planning thread. Equation 9 underlines the increasing number of threads because of this technique.

$$\#_{env} = 4 + \#_{omp}, \quad \#_{con} = 1, \quad \#_{robot} = 3 + \#_{omp} \cdot 5 \tag{9}$$

$$\rightarrow \frac{52}{3} \le \frac{4 + 16 + 1 + (3 + 16 \cdot 5) \cdot R}{1 + 1 + 4 \cdot R} < \frac{83}{4} \tag{10}$$

Let $\#_{omp}$ denote the number of threads that are created by OpenMP®, which is by default the number of virtual cores. The loop to update the robot models and the loop for the samples are unrolled into this number of threads (see Figure 4). Therefore, the latter causes an increase of threads by a factor of approximately $\frac{5}{4} \#_{omp}$. For a modern octa-core CPU with 16 virtual cores and $R = 60$ simulated robots, the fine granular parallelization produces around $\frac{5001}{242} \approx 20.67$ times more threads (see Equation 10).

V. ANALYSIS
In the following we illustrate how to create missions and analyze the usability of the simulation environment. Furthermore, the performance of the previously presented parallelization techniques for simulations is evaluated on a real system.

A. MANET
The MANET cost function should create a redundant, reliable network with the robots as base stations that covers as much area as possible. Hence, as long as the connection quality is constant, a large distance between the robots should be achieved to cover a larger area. If the quality decreases, it has to be decided whether a larger distance or a higher transmission rate is desired, whereas if the connection breaks down, the distance is not a positive aspect anymore and has to be reduced. Only two reasons for a connection loss exist: either the distance or a change in the environment, which cannot always be undone by the robot itself. Therefore, the only possibility to reestablish the connection is distance reduction. Equation 11 combines these aspects.

$$costs(d, q) = \begin{cases} -(d \cdot f_d + q \cdot f_i) & q > 0 \\ d & \text{else} \end{cases} \tag{11}$$

Let $d$ be the distance and $q \in [0, 1]$ the signal quality. $f_i$ and $f_d$ denote weights to prefer better connection quality or larger covered area. In general, negative costs indicate desired and positive costs repulsive areas for the optimizer.

The evaluation of this function does not lead to a circular or star-shaped arrangement, which corresponds to a high and slightly redundant network coverage of an area. Instead, the experiments show approximately equilateral triangles (see Figure 6). This can be explained by the optimal characteristics of equilateral triangles with regard to maximizing distances (see Figure 7). Another idea, always maximizing the minimum distance, is not appropriate either, because the connection with the minimum distance does not have to be the optimal connection due to obstacles and sources of interference,
whereas the modification of the distance term into a strictly monotonic and concave logarithmic function serves as a good solution because it prefers small equal distances between the robots instead of one large and many smaller distances (see Equation 12).

$$costs(d, q) = \begin{cases} -(\ln(d) \cdot f_d + q \cdot f_i) & q > 0 \\ d & \text{else} \end{cases} \tag{12}$$

Fig. 6. Real distribution of the robots [10]
Fig. 7. Theoretical distribution of the robots [10]

The correctness of the adjusted cost function is explained in [10] and shown in Figures 8 and 9, which present the desired ring and star structures. Here the position of the convoy vehicle, marked in blue, is not relevant and does not affect the cost functions; the same holds for the black circles, which are only for visualization purposes. A red circle illustrates the sensor range of the robot in its center. Green lines mark the connections and their qualities.

Fig. 8. Ring topology with inner and outer circle consisting of 30 robots [10]
Fig. 9. Combined ring and star topology with 30 robots [10]

B. How Much Parallelization Is Needed?
Figure 10 shows the comparison between fine granular and coarse thread level parallelization for a simulation of an empty environment with different swarm sizes. The results were tested at a significance level of α = 0.01 and show that the context switches and thread handling cost more performance than the parallel calculation gains. This holds even for complex environments with all different types of terrain accessibility, noise floors with two signal jammers at different positions, 20 static and 20 dynamic obstacles in the vicinity, and 22 static and 20 dynamic obstacles for the connections, which are important for the prediction. Moreover, the exploration mission was activated, which causes the robots to handle 10 cost functions per robot model and for itself inside the mission data structure (see Figure 11). Thus, for example, a robot inside a swarm of 60 agents has to handle 682 objects and 60 constantly updated robot models without the messages.

Fig. 10. Comparison between coarse and fine parallelization. Numbers with stars indicate that OpenMP® was used additionally. (Plot "Performance comparison with and without OpenMP": trajectory calculation time in ms over the number of simulated robots, from 3 to 60.)

Fig. 11. Performance comparison for complex environments. Numbers with stars indicate that OpenMP® was used additionally. (Plot "Performance comparison with and without OpenMP": trajectory calculation time in ms over the number of simulated robots, from 3 to 60.)

In both cases the deadline is never missed, but the distance of the means between Figures 10 and 11 decreases almost linearly with the number of robots, with a slope of −0.2, which shows the performance increase of the parallel processing. However, it is not sufficient for this simulation. In contrast to this, reducing the parallelism to the number of cores with consecutive access to the data structure is not possible either. If
we reduce the number of threads to the number of cores, then some threads with different deadlines would be combined. So, the computation time would increase by reducing the deadline to the smallest at the same time, or active polling would have to be used, which wastes CPU cycles as well. In addition to that, the realism of the hardware abstraction and the portability to real systems would be lost. Hence, the OpenMP® parallelization should be used only on real robots, which do not have to handle so many threads. For real robots only the planning and the object detection thread remain because the rest is done by hardware or exists in reality and does not need to be simulated.

VI. CONCLUSIONS
We present a novel approach to control large autonomous robot swarms. For this purpose, a real-time simulation tool to evaluate control algorithms for autonomous robots in a swarm is explained in detail. It is described how potential functions can be used to create different behaviors and missions for the entire swarm without directly assigning tasks to specific agents. To achieve emergent behavior that is based on limited local knowledge, realistic communication conditions are necessary. Thus, we outline an integrated model of wireless communication channels which is used inside the simulation to form a realistic environment. Afterwards, techniques to improve calculation times of the computationally intensive parts of the simulation are explained in detail. Efficient data structures are introduced and their access strategies from different threads are discussed. A sophisticated implementation reduces the number of synchronization steps and increases performance. After that, all parallelization possibilities are pointed out and analyzed. The fine granular parallelization with OpenMP® yields a measurable performance increase, but the offset caused by the context switches is too high. That is why this fine grained parallelization cannot be recommended for the simulation. Contrary to this observation, fine grained parallelism is valuable for the implementation of the control algorithms on real robots, because significantly fewer threads have to be managed on a robot than in the simulation. So there are usually cores available for additional threads.

In the next step, the control software will be evaluated on real robots. By using emulation, the simulation environment and the real robots are fused. This offers the ability to test real robots as agents of a large swarm without having the complete swarm available; instead, most of the robots are still virtual and only some are physically available to perform their tasks. Additionally, the behavior in different complex scenarios can be evaluated without building them in reality. This becomes necessary in extreme environmental conditions after disasters, which can hardly be reconstructed in a test bed. On the other hand, the simulation is tested on real data and can monitor real robots.

In further research we expand the available missions by logistic, formation and rendezvous with robots and objects missions. In addition to that, the control theory will be improved to support hard and soft constraints to give guarantees and

This research was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 276879186/GRK2193 [Gefördert durch die Deutsche Forschungsgemeinschaft (DFG) – 276879186/GRK2193]

REFERENCES
[1] L. E. Parker, “Distributed intelligence: overview of the field and its application in multi-robot systems,” Journal of Physical Agents (JoPha), vol. 2, no. 1, pp. 5–14, 2008.
[2] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, 2004, pp. 2149–2154.
[3] M. Quigley, K. Conley, B. Gerkey, J. Faust, T. Foote, J. Leibs, R. Wheeler, and A. Ng, “Ros: an open-source robot operating system,” ICRA Workshop on Open Source Software, vol. 3, 2009.
[4] T. Chung, “Darpa subterranean (subt) challenge,” 2017. [Online]. Available: https://www.darpa.mil/program/darpa-subterranean-challenge
[5] N. Palmieri, X.-S. Yang, F. D. Rango, and A. F. Santamaria, “Self-adaptive decision-making mechanisms to balance the execution of multiple tasks for a multi-robots team,” Neurocomputing, vol. 306, pp. 17–36, 2018.
[6] N. Palmieri, X.-S. Yang, F. D. Rango, and S. Marano, “Comparison of bio-inspired algorithms applied to the coordination of mobile robots considering the energy consumption,” Neural Computing and Applications, vol. 31, no. 1, pp. 263–286, 2019.
[7] L. Grüne and J. Pannek, Nonlinear Model Predictive Control: Theory and Algorithms, 2nd ed., ser. SpringerLink Bücher. Cham: Springer, 2017.
[8] P. Hämäläinen, J. Rajamäki, and C. K. Liu, “Online control of simulated humanoids using particle belief propagation,” in Proc. SIGGRAPH ’15. New York, NY, USA: ACM, 2015.
[9] T. Laue, “Eine verhaltenssteuerung für autonome mobile roboter auf der basis von potentialfeldern,” Diplomarbeit, Universität Bremen, Bremen, 5. Januar 2004.
[10] A. Puzicha, “Modeling and analysis of a distributed non-linear model-predictive control for swarms of autonomous robots with limited communication skills (in german),” Masterarbeit, TU Dortmund, Dortmund, 2019.
[11] J. H. Reif and H. Wang, “Social potential fields: A distributed behavioral control for autonomous robots,” Robotics and Autonomous Systems, vol. 27, no. 3, pp. 171–194, 1999.
[12] L. Lamport, “Time, clocks, and the ordering of events in a distributed system,” Commun. ACM, vol. 21, no. 7, pp. 558–565, 1978. [Online]. Available: http://doi.acm.org/10.1145/359545.359563
[13] LTE-Anbieter.info, “Maximale datenrate der luftschnittstelle bei lte: Wie errechnet sich diese eigentlich?” LTE-Anbieter.info, 2019. [Online]. Available: https://www.lte-anbieter.info/technik/datenrate-luftschnittstelle.php
[14] E. Ahlers, “Funk-übersicht: Wlan-wissen für gerätewahl und fehlerbeseitigung,” c’t, vol. 2015, no. 15, pp. 178–181, 2015. [Online]. Available: https://www.heise.de/ct/ausgabe/2015-15-WLAN-Wissen-fuer-Geraetewahl-und-Fehlerbeseitigung-2717917.html
[15] H. T. Friis, “A note on a simple transmission formula,” Proceedings of the IRE, vol. 34, no. 5, pp. 254–256, 1946.
[16] W. Heywang and R. Müller, Rauschen. Berlin, Heidelberg: Springer Berlin Heidelberg, 1990, vol. 15.
[17] W. J. Fokkink, Distributed algorithms: An intuitive approach, 2nd ed. Cambridge, Massachusetts and London, England: The MIT Press, 2018.


Heuristic Contention-Free Scheduling Algorithm for Multi-core Processor using LET Model
Shingo Igarashi∗ , Tasuku Ishigooka† , Tatsuya Horiguchi† , Ryotaro Koike∗ and Takuya Azumi∗
∗ Graduate School of Science and Engineering, Saitama University
† Center for Technology Innovation - Controls, Research and Development Group, Hitachi, Ltd.

Abstract: Embedded systems, e.g., self-driving systems and advanced driver-assistance systems (ADAS), require computing platforms with high computing power and low power consumption. Multi-/many-core platforms satisfy these requirements effectively. However, for hard real-time applications, multiple demands on shared resources can impede real-time performance, and memory is one resource that can impair the desired performance significantly. Therefore, it is important that memory access timing be deterministic to facilitate predictability. To realize this, the Logical Execution Time (LET) paradigm is currently attracting attention. This paper proposes a theoretical scheduling method for a model applying the LET paradigm to directed acyclic graph (DAG) nodes for a multi-/many-core platform. The proposed method considers communication timing between nodes and generates a schedule that does not cause communication contentions. In addition, the proposed method attempts to distribute tasks and reduce LET intervals to address increased execution times due to the implementation of the LET paradigm. In the evaluation, we observed that the proposed method improved the schedule length by up to 40%.

Index Terms: multi-rate DAG, multi-/many-core, communication contention, list-scheduling, logical execution time

I. INTRODUCTION

Embedded systems, such as self-driving systems (e.g., Autoware [1]) and advanced driver-assistance systems (ADAS), require high computing capacity and low power consumption. Multi-/many-core hardware for embedded systems (e.g., Kalray MPPA [2] and Tilera TILE-Gx [3]) satisfies these demands and is the focus of active study. Multi-/many-core hardware for embedded systems is suitable for parallel task processing and large-scale computations. In addition, clustered many-core systems, such as the Kalray MPPA processor, provide highly scalable and isolated areas of computation.

In a self-driving system, multiple applications run simultaneously, and each application has a deadline. Different applications, e.g., automatic brake and collision warning systems, utilize various types of sensor data and execute multiple processes. Such applications must execute (from sensor data acquisition to termination) before the deadline. The procurement of a directed acyclic graph (DAG) with an end-to-end deadline to achieve parallel distributed processing can be considered as a real-time application of automotive systems. In this paper, we propose a static DAG scheduling method to satisfy deadlines by accelerating processing using multi-/many-core hardware. Multi-/many-core hardware can speed up task processing; however, such processors have problems in terms of predictability and temporal determinism.

In clustered many-core systems, resource contentions, e.g., contentions induced by shared memory and cache, prohibit the system from satisfying real-time requirements. Therefore, it is necessary to avoid contentions and to calculate the delay caused by contentions accurately. We address this issue using the Logical Execution Time (LET) model [4] [5]. The LET model performs communication at fixed times determined by the LET section. A task to which the LET model is applied accesses memory at the same timing in each period. We allow concurrent executions on multiple cores and coordinate access to shared memory using a time-triggered schedule. Since access timing is adjusted to avoid overlap, communication time and task execution time are always constant. However, when adopting the LET model, the task execution time is set to be greater than the actual time; thus, the time from sensor data acquisition to application execution, i.e., end-to-end latency, may increase. Furthermore, there are multiple different periodic tasks in an automotive application. To address these issues, it is necessary to schedule jobs to occur in a hyperperiod in consideration of job dependencies and communication timing. In addition, we provide application modeling that combines the LET paradigm and the DAG. We propose a method based on a list scheduling mechanism that supports distributed processing and multi-rate periods.

Contributions: Our primary contributions are summarized as follows.
• We propose a theoretical method for parallel and distributed processing of LET tasks using multi-/many-core processors while avoiding communication contentions.
• We propose DAG scheduling, which executes distributed processing and reduces idle time in the overestimated LET interval to reduce the increase in execution time caused by implementing the LET paradigm.
• We reconstruct the job dependency generated from the task in the hyperperiod to make it compatible with applications with multi-rate periods.

The remainder of this paper is organized as follows. The system model and motivation of this study are described in Section II. Section III discusses assumptions about the scheduling problem. A scheduling approach is provided in Section IV, and Section V proposes methods to reduce execution time to eliminate deadline-miss. Experimental methods,


results, and considerations are presented in Section VI, and related work is discussed in Section VII. Finally, conclusions and directions for future work are presented in Section VIII.

[Figure: timeline of one LET section, contrasting the logical view (Logical Execution Time) with the physical view (read, execution, write) over time.]
Fig. 1. Task execution using the LET model.

[Figure: block diagram with 16 compute clusters (CC0 to CC15) in the center, surrounded by north, east, west, and south IO subsystems (IO cores, DDR, PCIe, Ethernet, NoC interfaces). Each compute cluster contains PE 0 to PE 15 with instruction and data caches (IC, DC), a resource manager (RM), a debug support unit (DSU), 2 MB SMEM (SRAM), and a NoC interface with DMA, Rx, Tx, micro core (UC), and NoC router.]
Fig. 2. Architecture of Kalray MPPA2-256 Bostan.

II. SYSTEM MODEL

A. Application Model


Modeling data flow is required to schedule the mandatory applications in an automotive system [1]. We employ a DAG to model data flow (Fig. 5). An automotive system can be represented as a DAG by representing each process as a node of the DAG and data flow to or from a process as a directed edge. Here, each process (i.e., node) requires time for computations, and each end node (i.e., application) has a deadline. The presence of a directed edge indicates that a pair of nodes has an order constraint or data dependency. In addition, a DAG is multi-rate because it receives data from multiple periodic sensors.

B. LET Model

Here, we introduce our LET model. The LET model fixes the processing time from read processing (input) to write processing (output) regardless of the actual task execution time (Fig. 1). At the beginning of the LET section, the task reads (inputs) the required data into the local memory area where the task is arranged. When the task is executed, data are exchanged with the local area such that the task can be executed without accessing shared memory (Section II-C). After the task execution is complete, the updated data are written (output) to the shared memory at the end of the LET section. According to the LET paradigm, access timing to shared data becomes deterministic. In multi-/many-core systems, shared data can be accessed by multiple cores, and, generally, it is necessary to implement an exclusive control system. However, by implementing the LET paradigm, we can realize a design that does not require exclusive control.

The LET section must be equal to or longer than the latency of the task (i.e., computation + communication time), and typically, the task period is adopted. Therefore, as shown in Fig. 1, there may be a time during which no processing is performed in the LET section. We regard each job originating from one node of the DAG as one LET task, and set the length of each LET task to be less than or equal to the period of that node (Section III-B).

C. Architecture Model

In this section, we describe the assumed architecture model. This model is compliant with the existing model [6]. In this paper, a single DAG application is executed by a single compute cluster (CC). Each CC has multiple cores and local shared memory, which is connected by a shared bus, and each core has its own scratchpad memory or private memory bank. In addition, there is a large-capacity shared memory outside the CC. Note that access to the external memory requires more time than access to the local memory in the CC. One core in the CC manages the memory access inside and outside the CC, and transmits data. All remaining cores are computational cores. The Kalray MPPA many-core processor architecture satisfies these requirements [2].

The Kalray MPPA2-256 Bostan has 16 CCs in the center, and each CC has 16 cores (Fig. 2). The Kalray MPPA2-256 serves as a means of communication between CCs assembled in a network-on-chip (NoC) formulation [2]. Here, each CC has 2 MB of SRAM memory shared locally, the 16 cores of each CC access their shared memory through a bus connection, and each I/O cluster contains 2 MB of SRAM and 2 GB of DDR memory. In addition, each core has a private instruction and data cache, and is connected to the local shared memory through a bus [7]. The resource manager (RM) in the CC is responsible for managing memory access, and the RM can copy data from external memory to local memory.

Memory Bank Privatization: Memory bank privatization is one of the functions of the MPPA2-256 Bostan that must be considered [6] [8]. Local memory in the CC is divided into 16 local memory banks (128 KB each), each of which is assigned to one core as a private memory bank (Fig. 3). Here, each core has a private access path to its private memory bank; therefore, there is no interference between a core and its private memory bank. Combining this memory bank privatization with the aforementioned LET model enables task execution without contentions. Fig. 3 visualizes the execution of the task assigned to the core in the upper right. Global copies of the communication variables are stored in the global bank, which is a different bank in the CC that can be accessed from all cores as shared memory for communication. Note that interference is possible when accessing this global bank.

D. Motivation

Next, we discuss the motivation of this study. Consider the simple DAG in Fig. 4(a) using two cores. In Fig. 4(b), assume that this DAG is scheduled while observing the priority constraints. In practical applications, communication time is consumed to exchange data between tasks, and contention for shared resources may occur during communication; thus, the application execution timing varies due to delay. Therefore, we schedule communication phases using the LET model.
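The LET semantics of Section II-B can be illustrated with a minimal Python sketch. It assumes integer unit times and that execution starts right after the read phase; the names `LetPhases` and `let_phases` are hypothetical and do not come from the paper. The sketch shows that the write phase stays at the end of the LET section regardless of the actual execution time, which is what makes the access timing to shared data deterministic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LetPhases:
    read: tuple      # (start, end) of the read phase
    execute: tuple   # (start, end) of the execution phase
    write: tuple     # (start, end) of the write phase
    idle: int        # idle time between execution end and write start


def let_phases(start, let_len, read_time, exec_time, write_time):
    """Place read at the start and write at the end of one LET section.

    Assumption for illustration: execution begins immediately after
    the read phase, and all times are integer unit times.
    """
    if let_len < read_time + exec_time + write_time:
        raise ValueError("LET section shorter than read + execute + write")
    read = (start, start + read_time)
    execute = (read[1], read[1] + exec_time)
    end = start + let_len
    write = (end - write_time, end)
    return LetPhases(read, execute, write, idle=write[0] - execute[1])


# Same LET section, two different actual execution times.
slow = let_phases(start=0, let_len=20, read_time=2, exec_time=10, write_time=3)
fast = let_phases(start=0, let_len=20, read_time=2, exec_time=6, write_time=3)
```

In this sketch, a shorter execution time only enlarges the idle gap; the read and write windows, and therefore the memory access pattern seen by other cores, do not move.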


[Figure: the 2 MB SRAM of one CC is divided into 16 banks of 128 KB (128 KB × 16 = 2 MB), and one bank is assigned to each of the 16 cores in the CC. Read and write accesses go to the global bank (shared memory bank); the core that owns the global bank is not used for computation.]
Fig. 3. Memory bank privatization.

[Figure: (a) DAG example of motivation with nodes n1 to n4; (b) simple task scheduling on Core 1 and Core 2; (c) LET task scheduling; (d) reduction of the latency, with n4 split into n4.1 on Core 1 and n4.2 on Core 2. Each schedule is marked with the time of the application output.]
Fig. 4. Motivation of this study.

Fig. 4(c) shows a task schedule image using the LET model. Execution and communication processing are scheduled in each LET task, where communication processing is scheduled to not overlap with the communication processing of other LET tasks. Applying the LET model to tasks eliminates the need to consider communication delay; therefore, application execution timing is expected to be constant. The method without idle time is referred to as read-execute-write semantics, which is frequently used for scheduling methods that consider contentions. Compared to this method, the LET model is very flexible because the execution time of each process may change during development due to the addition of functions. In read-execute-write semantics, if functions are added, in the worst case, we may need to change the overall schedule (i.e., core allocation and execution order). However, the extra idle time in the LET model eliminates the need for such schedule changes, thereby improving development efficiency.

While the LET model has these advantages, it also has disadvantages. The time taken by the chain of tasks from sensor data (i.e., the entry node) to application execution (i.e., the exit node) is referred to as end-to-end latency. The LET section is longer than the task execution time; therefore, when the LET model is adopted, end-to-end latency can be longer than when other models are adopted (Fig. 4(c)). As a result, application execution is delayed, and a deadline-miss may occur in the worst case.

We address this issue using a multi-/many-core processor to distribute LET tasks, and we set the idle time in the LET section appropriately. Fig. 4(d) shows the results of applying these techniques to Fig. 4(c). For tasks $n_2$ and $n_3$, the idle time in the LET section is reduced. However, the longer the idle time in the LET section, the greater the merit of introducing LET (but latency will increase). Thus, our goal is to ensure that applications do not miss deadlines, and we reduce the LET section only gradually. This makes it possible to reduce deadline-misses while leaving as much idle time as possible in the LET section. For task $n_4$, the execution time is reduced by distributing processing across two cores.

III. SCHEDULING ASSUMPTION

This section describes the scheduling assumptions, the settings, and constraints for the target problem.

A. Application Notation

In the following, we describe the proposed scheduling method using the DAG example shown in Fig. 5. Note that all parameters are expressed in unit time. Here, DAG $G$ comprises nodes $V(G)$ and edges $E(G)$. The computation time of node $n_i \in V(G)$ is expressed as $comp(n_i)$. This computation time represents the measured execution time of the task when no contention occurs with other tasks in a single core. This value is assumed to be acquired offline. Data are communicated between the nodes. An edge entering a node indicates the amount of data to be read by the node, and an edge exiting the node indicates the amount of data to be written. The required communication time of node $n_i$, i.e., $comm(n_i)$, for the entire node is calculated as follows.

$$comm(n_i) = \left(\left\lceil \frac{data^{r}_{i}}{D_{slot}} \right\rceil \times T_{slot}\right) + \left(\left\lceil \frac{data^{w}_{i}}{D_{slot}} \right\rceil \times T_{slot}\right) \tag{1}$$

Here, $data^{r}_{i}$ and $data^{w}_{i}$ represent the amount of data to be read and written, respectively, and $T_{slot}$ is the maximum allocation time determined by a round robin policy [9]. $D_{slot}$ is the amount of data that can be transmitted during $T_{slot}$. The communication time for edge $e_{i,j} \in E(G)$, i.e., $comm(n_i, n_j)$, is calculated as follows.

$$comm(n_i, n_j) = \left(\frac{data(e_{i,j})}{D_{slot}}\right) \times T_{slot} \tag{2}$$

Here, $e_{i,j}$ represents an edge connecting node $i$ to node $j$, $comm(n_i)$ is used to set the LET section and determine the communication time of the LET, and $comm(n_i, n_j)$ is used to assign task priority.

The LET section includes both communication and execution processing; thus, the Worst-Case Execution Time (WCET) of node $n_i$ can be calculated as follows.

$$WCET(n_i) = comp(n_i) + comm(n_i) \tag{3}$$

The WCET value is used to set the length of the LET section for the task (Section III-B).

The blue nodes ($n_2$ in the figure) can be computed in parallel. The need for parallel computation must be confirmed by offline analysis. Hereafter, we label a node that requires parallel computation as a parallel node. In order to reduce the length of the schedule, a parallel node is computed with multiple cores in Section V-B.

The entrance node of the DAG must correspond to the period of the sensor data to be acquired; therefore, we set the period at the entrance node. Nodes other than entrance nodes


[Figure: an original DAG with nodes $n_0$ to $n_4$ (period $n_0$ = 664, period $n_1$ = 1328) and the generated multi-rate DAG for $HP(G) = 1328$ with jobs $n_{0,0}$, $n_{1,0}$, $n_{0,1}$, $n_{2,0}$, $n_{3,0}$, $n_{2,1}$, $n_{4,0}$, $n_{4,1}$. The legend marks communication data, deadline, computation time, and nodes that can be calculated in parallel (parallel nodes).]
Fig. 5. Original DAG (left) and DAG generated in hyperperiod (right).

inherit the maximum period of predecessor nodes of the given node. The hyperperiod of a DAG is the least common multiple of the period of each node, i.e., the least common multiple of the periods of the entrance nodes. Each node of the DAG creates jobs in a hyperperiod. Each job inherits the execution time and communication time of its node.

The deadlines of the DAG vary for each application and are provided from the entry node to the exit node. Note that each job of end nodes also has a deadline.

B. Setting the Period and Length of LET section

Generally, the length of the LET section is equal to the period; however, we set the LET section with a value smaller than the period. This is a necessary condition for avoiding overlapping memory accesses using our LET model. Note that the period and LET section must be longer than the WCET of each task. Therefore, we calculate the greatest sum of execution and communication time in the entire application (i.e., the DAG). In addition, idle time is required in the LET section. We provide this margin time through the parameters $\alpha$ and $\beta$. First, the period of entry nodes $n_{entry}$ is determined. $\alpha$ is a margin for setting periods. The period of entry nodes is calculated as follows.

$$period(n_{entry}) = \max_{n_i \in V(G)} WCET(n_i) \times (1 + \alpha) \tag{4}$$

Other nodes inherit the maximum value of the period of predecessor nodes $Pred()$ as follows.

$$period(n_i) = \max_{n_k \in Pred(n_i)} period(n_k) \tag{5}$$

We schedule jobs generated within the hyperperiod. The hyperperiod $HP(G)$ is set as follows.

$$HP(G) = \mathrm{LCM}_{n_i \in V(G)}\ period(n_i) \tag{6}$$

Here, LCM is a function to calculate the least common multiple.

We set the LET section to be smaller than the period; therefore, the LET section is set using margin $\beta$ with a value greater than or equal to the WCET of the task. $\beta$ is a margin for setting the length of the LET section. For simplicity, $\beta$ is set to the same value for all tasks in this paper. The length of the LET section of task $i$ is expressed as follows.

$$LET(n_i) = WCET(n_i) \times (1 + \beta) \tag{7}$$

With this setting, there may be tasks that cause too much idle time in the LET section. This issue is discussed in Section V-A.

C. Memory Constraints and Communication Settings

Here, we describe the memory used to store task code and data. It is assumed that task code is stored in the private memory bank of the execution core of the task; therefore, the task job is executed on the same core each time. Communication data are read in the read processing of the LET section, and in write processing, updated data are written to the global memory bank. Typically, the amount of memory required to store code for a single task is small compared to the size of the core’s private memory bank; however, multiple tasks can be assigned to the same core. As a result, the required memory usage may exceed the memory size. Therefore, Becker et al. [6] used an optimization method to determine whether it is better to assign code locally or externally for each task. However, determining the optimal memory allocation is not related to our purpose; thus, we assume all code and data fit in memory. Although parallel nodes can be executed on multiple cores, the code for a task is statically allocated to the private memory banks of multiple execution cores. Note that scheduling methods that consider memory usage will be the focus of future work.

D. Scheduling Approach

Time-triggered schedule: Contention can be avoided (even in the memory access phase) by introducing a time-triggered schedule [6] [8]. A time-triggered schedule is a scheduling method that statically determines the activation timing of a task. As described in Section II-C, combining memory bank privatization and the LET model makes it possible to execute a task without contentions; however, contention can still occur when tasks access the global bank (which can occur during the read and write phases). A time-triggered schedule coordinates access to the global bank such that the read and write phases do not overlap.

IV. PROPOSED ALGORITHM

This section introduces a heuristic approach based on list scheduling for DAG applications using the LET model. List scheduling can be divided into task-prioritizing and processor-selection phases. In the task-prioritizing phase, priorities are assigned to each task. In the processor-selection phase, execution cores are determined in order from the task with the highest priority.

A. Generating Jobs in a Hyperperiod and Redefining a Dependency

Multiple jobs are generated within a hyperperiod. We must verify the generation of these jobs and define their dependencies. First, we generate jobs from the tasks in a hyperperiod and define the dependencies of these jobs. Hereafter, we refer to the tasks generated from a single node as jobs (i.e., LET tasks).
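The quantities of Section III (Equations (1) and (3) to (7)) can be traced with a short Python sketch. It is an illustration under stated assumptions, not the authors' implementation: times are integer unit times, slot counts are rounded up, the entry period of Equation (4) is rounded up to an integer so that a least common multiple exists, and the function names and the small DAG (edges n0→n2, n1→n3, n2→n4, n3→n4) are hypothetical. Only the two entry periods, 664 and 1328, are borrowed from the labels of Fig. 5.

```python
import math
from functools import reduce


def slot_time(data, d_slot, t_slot):
    """Time to transfer `data` in slots of size d_slot (integer ceiling)."""
    return -(-data // d_slot) * t_slot


def comm_time(data_r, data_w, d_slot, t_slot):
    """Eq. (1): read time plus write time of one node."""
    return slot_time(data_r, d_slot, t_slot) + slot_time(data_w, d_slot, t_slot)


def wcet(comp, data_r, data_w, d_slot, t_slot):
    """Eq. (3): computation time plus communication time."""
    return comp + comm_time(data_r, data_w, d_slot, t_slot)


def entry_period(wcets, alpha):
    """Eq. (4), rounded up to an integer unit time (assumption)."""
    return math.ceil(max(wcets) * (1 + alpha))


def inherit_periods(entry_periods, preds, topo_order):
    """Eq. (5): non-entry nodes take the maximum period of their predecessors."""
    period = dict(entry_periods)
    for node in topo_order:
        if node not in period:
            period[node] = max(period[p] for p in preds[node])
    return period


def hyperperiod(period):
    """Eq. (6): least common multiple of all node periods."""
    return reduce(lambda a, b: a * b // math.gcd(a, b), period.values())


def let_length(node_wcet, beta):
    """Eq. (7): WCET enlarged by the margin beta."""
    return node_wcet * (1 + beta)


# Hypothetical DAG for illustration only.
preds = {"n2": ["n0"], "n3": ["n1"], "n4": ["n2", "n3"]}
periods = inherit_periods({"n0": 664, "n1": 1328}, preds,
                          ["n0", "n1", "n2", "n3", "n4"])
hp = hyperperiod(periods)
jobs_per_node = {node: hp // p for node, p in periods.items()}
```

With these assumed edges, `n4` inherits the larger period of its two predecessors, so it releases one job per hyperperiod while `n0` and `n2` release two.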


Initially, we let the period of an entry node be inherited by all nodes, as discussed in Section III-B. We then generate $\frac{HP(G)}{period(n_i)}$ jobs for each node. Here, let $J_i$ be the set of jobs generated from node $n_i$ and let $n_{i,j-1}$ be the $j$-th released job of node $n_i$.

Finally, job dependencies are defined for jobs that satisfy the dependencies of the original node and have the closest tentative release time. The tentative release time differs from the actual release time and is defined as follows. The tentative release time of $n_{i,j}$ is denoted $trt(n_{i,j})$ and is determined by $trt(n_{i,j}) = period(n_i) \times j$ when the first tentative release time of all jobs is zero. Note that the actual release time is the maximum finish time of the preceding task plus communication time. We schedule the newly generated DAG graph on the right side of Fig. 5.

B. Generating Execution Job List

Next, we organize the jobs to be scheduled. Here, each job of the DAG is considered a single LET job, and the length of the LET section is determined as defined in Section III-B. One LET task $n_{i,j}$ includes two processes, i.e., execution process $E_{i,j}$ and communication process $C_{i,j}$. In addition, one communication process $C_{i,j}$ includes two phases, i.e., reading process $R_{i,j}$ and writing process $W_{i,j}$.

To avoid contention, we only need to ensure that the timing of communication processes does not overlap. Therefore, we schedule the communication process of all LET tasks first, and then schedule the execution process of all jobs. If the timing of the reading process is determined (i.e., the start time of the LET task), the timing of the writing process is also determined. A schedule is required to ensure that this communication process does not overlap among all LET tasks. Due to our problem setting, no contention occurs during the execution of the job; thus, it will be inserted at a timing that falls within the LET section.

For these reasons, we create separate task queues for execution and communication tasks. These task queues are sorted according to priority; therefore, two tasks derived from the same LET task are executed in the same order in each task queue. In the processor-selection phase, the execution core and execution timing are first determined from the queue for the communication processing task. Then, an execution task in the execution processing task queue is scheduled. This method stems from the concept that schedule determinism will be maintained once the start and end of the LET are determined. The execution time of each process is expressed as follows.

$$e(n_{i,j}) = comp(n_i) \tag{8}$$

$$r(n_{i,j}) = \left\lceil \frac{data^{r}_{i,j}}{D_{slot}} \right\rceil \times T_{slot} \tag{9}$$

$$w(n_{i,j}) = \left\lceil \frac{data^{w}_{i,j}}{D_{slot}} \right\rceil \times T_{slot} \tag{10}$$

For execution processing time $e(n_{i,j})$, the computation time of the task obtained by the offline analysis is used without modification. Here, read time $r(n_{i,j})$ and write time $w(n_{i,j})$ can be calculated based on the amount of data required for reading and writing.

C. Task-prioritizing Phase

In the task-prioritizing phase, laxity, which represents the margin to the deadline, is calculated recursively by traversing the job graph. Low laxity indicates that little margin remains before the deadline, and jobs with low laxity values have high priority. The equation for calculating laxity differs depending on whether a successor job exists, whether a job is the last release, and whether a job has a deadline. The laxity is determined so that a job with a smaller release index has a smaller laxity than a job with a larger release index, which prevents the reversal of scheduling order. Furthermore, a successor job is given a larger laxity than its predecessor job so that job dependencies are not violated.

We define the deadline of end nodes $D(n_{exit})$ as follows.

$$D(n_{exit}) = CriticalPath\_Cost(n_{exit}) \times (1 + \gamma) \tag{11}$$

Here, $CriticalPath\_Cost$ is the maximum path to the end node (computation time + communication time), and $\gamma$ is a margin for setting deadlines. The deadline of a job increases with the number of releases, i.e., $D(n_{i,j}) = D(n_i) \times (j + 1)$. We define the laxity for jobs with deadlines $laxity(n_{exit,j})$ as follows.

$$laxity(n_{exit,j}) = D(n_{exit,j}) - e(n_{exit,j}) \tag{12}$$

Depending on the period of an entry node, a job that has no successors and no deadline may occur. This is a job that has a lower period than the successor node and is not used for successor jobs. This type of job should have a smaller laxity than later released jobs as follows.

$$laxity(n_{i,j}) = laxity(n_{i,j+1}) - e(n_{i,j}) \tag{13}$$

A job that is the last release and has no successors and no deadline may occur. This job type is generated from nodes with a small period and is always generated after the last release of jobs with the deadline. Its laxity must be greater than the maximum laxity of the exit job. We give it a greater laxity than that of the job with a deadline and the last release as follows. This way of determining the laxity does not break the job release time order and job dependencies.

$$laxity(n_{i,last}) = \max_{k \in DSet} \{D(n_{k,last})\} + e(n_{i,last}) \tag{14}$$

where $DSet$ represents a set of jobs with deadlines.

The laxity of a job with a successor must be smaller than that of the successor and later released jobs as follows.

$$laxity(n_{i,j}) = \min\Bigl\{ laxity(n_{i,j+1}) - e(n_{i,j}),\ \min_{n_{k,l} \in succ(n_{i,j})} \{laxity(n_{k,l}) - comm(n_{i,j}, n_{k,l})\} - e(n_{i,j}) \Bigr\} \tag{15}$$

The laxity for jobs with successors that are the last release is defined as follows.

$$laxity(n_{i,last}) = \min_{n_{k,l} \in succ(n_{i,last})} \{laxity(n_{k,l}) - comm(n_{i,last}, n_{k,l})\} - e(n_{i,last}) \tag{16}$$

By considering job priorities, laxity makes it possible to assign priorities in consideration of application deadlines. In addition, the calculation is performed in order from the end node; thus, job dependencies can be protected.
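The case distinction of Equations (12) to (16) can be sketched as a memoized recursion in Python. This is a minimal sketch under assumptions: the job-level successor lists and edge communication times are supplied directly as input (the rule that derives them from tentative release times is not reproduced), and `build_laxity` with its argument names is hypothetical. The example uses a two-node chain in which `n0` is released twice per hyperperiod and the exit node `n1` once, so that the second job of `n0` has neither a successor nor a deadline.

```python
from functools import lru_cache


def build_laxity(e, comm, succ, n_jobs, deadline):
    """Return laxity(node, j) following the cases of Eqs. (12)-(16).

    e[node]           execution time, Eq. (8)
    comm[(src, dst)]  edge communication time between two jobs
    succ[job]         successor jobs of a job, job = (node, j)
    n_jobs[node]      number of jobs of the node in the hyperperiod
    deadline[node]    D(node) for exit nodes only
    """
    def job_deadline(node, j):
        return deadline[node] * (j + 1)  # D(n_{i,j}) = D(n_i) * (j + 1)

    max_last_deadline = max(job_deadline(k, n_jobs[k] - 1) for k in deadline)

    @lru_cache(maxsize=None)
    def laxity(node, j):
        job = (node, j)
        is_last = j == n_jobs[node] - 1
        successors = succ.get(job, ())
        if node in deadline:                       # Eq. (12)
            return job_deadline(node, j) - e[node]
        if not successors:
            if is_last:                            # Eq. (14)
                return max_last_deadline + e[node]
            return laxity(node, j + 1) - e[node]   # Eq. (13)
        via_succ = min(laxity(*s) - comm[(job, s)] for s in successors) - e[node]
        if is_last:                                # Eq. (16)
            return via_succ
        return min(laxity(node, j + 1) - e[node], via_succ)  # Eq. (15)

    return laxity


# Hypothetical example values.
e = {"n0": 3, "n1": 5}
n_jobs = {"n0": 2, "n1": 1}
deadline = {"n1": 30}
succ = {("n0", 0): [("n1", 0)]}
comm = {(("n0", 0), ("n1", 0)): 2}

laxity = build_laxity(e, comm, succ, n_jobs, deadline)
jobs = [(node, j) for node in n_jobs for j in range(n_jobs[node])]
queue = sorted(jobs, key=lambda job: laxity(*job))  # low laxity first
```

Sorting by ascending laxity yields the priority order used by list scheduling: in this example the predecessor job precedes its successor, and the leftover last release of `n0` comes last.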


Algorithm 1: Processor allocation to avoid contention
Input: Current job n_{i,j}, and schedule results to date
Output: Start time of job execution, and execution core
    min = MAX;    /* Upper limit of unit time */
    foreach core p ∈ P do
        Equation (17);
        while memory_ok() do
            EST = EST + 1;    /* Find the timing when memory contention does not occur */
        end
        if min > EST then
            min = EST;
            mapping_core = p;
        end
    end
    EST(n_{i,j}) = min;    /* Map to the core with the earliest start time */
    EFT(n_{i,j}) = EST(n_{i,j}) + LET(n_{i,j});

D. Processor-selection Phase

Next, an execution core and execution timing are determined, starting from a high priority job, i.e., a low-laxity job. These are determined using EST (Earliest Start Time) and EFT (Earliest Finish Time).

First, a processor is determined from the communication processing task queue. Here, the EST of LET task $EST(n_{i,j})$, i.e., the start time $ST(R_{i,j})$ of the reading process, is calculated as follows.

$$EST(n_{i,j}) = ST(R_{i,j}) = \min_{p \in P} \max\Bigl(trt(n_{i,j}),\ avail[p],\ \max_{n_{k,l} \in Pred(n_{i,j})} EFT(n_{k,l})\Bigr) \tag{17}$$

Here, $avail[p]$ indicates the available time of the target core $p$. If there are predecessor nodes with dependencies, a job cannot begin processing until the LET section of its predecessor node has been completed. In addition, the larger of the available time of the core and the latest completion time of the preceding nodes is the earliest time at which the task can be started.

When the provisional EST value is obtained, it is necessary to determine whether another memory access occurs during the job’s read processing time. If memory access occurs, the start time of the LET task is delayed until the memory access does not overlap by incrementing the EST value. Algorithm 1 describes a processor allocation process that avoids memory contention. For each core, use Equation (17) to find the EST of a job. Then use function memory_ok() to check if the LET task’s memory access (i.e., read and write) conflicts with already scheduled jobs. The function returns 1 if there is contention, 0 otherwise. Increase EST as long as memory access contention occurs. From the job start time, the memory access timing can be easily obtained as in the following equations. When the EST of each core is obtained, the job is executed on the core with the smallest EST.

When the EST is obtained, the end time of the read processing task and EFT of the LET task, i.e., the end time of the write processing $FT(W_{i,j})$, can be obtained as follows.

$$FT(R_{i,j}) = ST(R_{i,j}) + r(n_{i,j}) \tag{18}$$

$$EFT(n_{i,j}) = EST(n_{i,j}) + LET(n_{i,j}) \tag{20}$$

Similarly, the start time of the write processing $ST(W_{i,j})$ is calculated as follows.

$$ST(W_{i,j}) = FT(W_{i,j}) - w(n_{i,j}) \tag{21}$$

As a result, the job EST is obtained for all cores, and the mapping is determined for the core with the smallest EFT.

With the above formulas, the schedule of the communication processing task is completed. Next, the execution processing is inserted into the idle time in each LET section. The length of the LET section is derived from the maximum execution time of the task (Section III-B); thus, there is always an idle time into which the execution processing time can be inserted. Unless there is a particular problem, the execution process begins immediately after the reading process is completed.

$$ST(E_{i,j}) = FT(R_{i,j}) \tag{22}$$

$$FT(E_{i,j}) = ST(E_{i,j}) + e(n_{i,j}) \tag{23}$$

V. DEADLINE-MISS

To reduce deadline-miss, we propose a method to reduce the latency to run the application. There may be a LET task with more idle time than is required, and such tasks may affect deadline-miss. Therefore, we reduce the idle time of the LET task step by step, and we analyze the effect of the length of the LET section. In addition, we propose an offloading method that performs distributed processing on multiple cores for tasks that can be parallelized.

A. Reduction of LET Section

The LET section follows the node WCET using margin $\beta$ (Section III-B); therefore, depending on the given task, it is possible that an excessively long LET section will be set. Here, we explain how to reduce latency to eliminate deadline-miss by gradually reducing the idle time in the LET section.

Recall that margin $\beta$ is used to set the length of the LET section; therefore, first, the length of the LET section of the task is reduced to $WCET(n_i) \times (1 + \beta)$. In this case, if a deadline-miss occurs, the ratio of the margin is further reduced. For example, if the LET section is set by adding a 20% margin of the WCET of the task, the margin is reduced from 20% to 10%. In other words, the margin used to set the LET section is reduced by 10 percentage points at a time, and each time, the schedule is redone, and we determine whether a deadline-miss has occurred. Eventually, if there is no idle time in the LET section and deadline-miss still occurs, scheduling is considered impossible. We gradually reduce the margin in this manner because we seek to exploit the advantages of LET. Depending on the idle time in the LET section, it is possible to cope flexibly with changes such as a change in the execution time of a task.

B. Offloading LET Task Method

Here, we describe the method to offload a task to another
core to reduce latency. First, we introduce Amdahl’s law to
EF T (ni,j ) = F T (Wi,j ) = EST (ni,j ) + LET (ni,j ), (19) determine the calculation time for distributed task processing.
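The stepwise margin reduction of Section V-A can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `chain_makespan` is a hypothetical stand-in for the real scheduler (it simply runs all tasks back to back on one core), and the function names, the 0.1 step, and the single end-to-end deadline are choices made only for this example.

```python
from typing import Callable, Optional, Sequence


def let_length(wcet: float, beta: float) -> float:
    """LET section length for one task: WCET * (1 + beta)."""
    return wcet * (1.0 + beta)


def chain_makespan(wcets: Sequence[float], beta: float) -> float:
    """Hypothetical stand-in for the scheduler: tasks run back to back on one core."""
    return sum(let_length(w, beta) for w in wcets)


def reduce_margin_until_schedulable(
    wcets: Sequence[float],
    deadline: float,
    beta_start: float,
    makespan_fn: Callable[[Sequence[float], float], float] = chain_makespan,
    step: float = 0.1,
) -> Optional[float]:
    """Return the largest margin on the step grid that meets the deadline.

    Returns None if the deadline is still missed with no idle time (beta = 0),
    i.e., the task set is considered unschedulable.
    """
    # Count in integer multiples of `step` to avoid floating-point drift.
    k = round(beta_start / step)
    while k >= 0:
        beta = k * step
        if makespan_fn(wcets, beta) <= deadline:
            return beta
        k -= 1  # shrink the margin by one step and reschedule
    return None


if __name__ == "__main__":
    print(reduce_margin_until_schedulable([100, 200, 300], deadline=700, beta_start=0.5))
```

Because the loop starts from the configured margin and stops at the first value that meets the deadline, it keeps as much idle time in the LET sections as the deadline allows, which is the stated motivation for reducing the margin gradually.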

[Fig. 6. Process flow of parallel node. Diagram labels: work distribution (front node F2), offloading and parallel processing (offloading tasks O2), result summarization (back node B2), placed between nodes n1 and n3.]

Algorithm 2: Proposed Algorithm
  Input: A multi-rate DAG, cores
  Output: Schedule results for the DAG, i.e., start times and finish times of all jobs, execution cores, and number of execution cores
   1  Generate jobs in a hyperperiod and redefine job dependencies (IV-A);
   2  Calculate the laxity of each job (IV-C);
   3  Generate the execution task list and enqueue jobs in increasing order of laxity (IV-B);
   4  while scheduling results have not been obtained for all patterns of offloading cores do
   5      if parallel nodes exist then
   6          Select a number of offloading cores for which no schedule result has been obtained (V-B);
   7      end
   8      while a job exists in the Priority_Queue do
   9          Dequeue the job n_{i,j} which has the minimum laxity in the Priority_Queue;
  10          if parallel node then
  11              produce offloading tasks (V-B);
  12          end
  13          Compute the EST and EFT for n_{i,j} (IV-D);
  14          Allocate n_{i,j} to the core that gives the smallest EFT with an insertion-based policy (Algorithm 1);
  15      end
  16  end
  17  Select a result that does not miss a deadline among all patterns;
  18  if deadline-miss then
  19      if margin rate == 0 then
  20          non-schedulable;
  21      end
  22      else
  23          reduce the margin rate (V-A);
  24          return to line 1;
  25      end
  26  end

Amdahl's law: Amdahl's law gives the performance improvement rate achieved by a specific computer system when the degree of parallelism of the computer is increased. The performance improvement rate $S(N)$ is expressed as follows:

$$S(N) = \frac{1}{(1 - K) + \frac{K}{N}}, \tag{24}$$

where $K$ is the execution time ratio of the entire parallelizable component, $1 - K$ is the execution time ratio of the nonparallelizable component, and $N$ is the number of cores.
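As a quick numerical check of Equation (24), the following sketch evaluates the speedup for a given parallelizable ratio K and core count N, and divides a node's execution time by that speedup, as the paper does for offloading nodes in Equation (25). The function names are illustrative, and the value K = 0.7 is taken from the evaluation settings reported later in the paper; the execution time of 1000 time units is an arbitrary example value.

```python
def amdahl_speedup(k: float, n: int) -> float:
    """Performance improvement rate S(N) = 1 / ((1 - K) + K / N)."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("K must lie in [0, 1]")
    if n < 1:
        raise ValueError("N must be at least 1")
    return 1.0 / ((1.0 - k) + k / n)


def offloaded_exec_time(e_node: float, k: float, n: int) -> float:
    """Execution time of an offloading node: e(n_i) / S(N)."""
    return e_node / amdahl_speedup(k, n)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        s = amdahl_speedup(0.7, n)
        print(f"N={n:2d}  S(N)={s:.3f}  e(O)={offloaded_exec_time(1000.0, 0.7, n):.1f}")
```

With K = 0.7 the speedup is bounded above by 1 / (1 - K), roughly 3.33, so adding cores yields diminishing returns; this is one reason the number of offloading cores is worth treating as a search parameter rather than always using every available core.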


Amdahl's law gives the performance improvement rate obtained when the work of a single core is distributed over multiple cores; therefore, the computation time of the offloading node $O_i$ is obtained as follows.

$$e(O_i) = \frac{e(n_i)}{S(N)}, \tag{25}$$

This represents the computation time when distributed processing is performed; however, in practice, there should be a node that distributes work before distributed processing and a node that aggregates results after distributed processing. In addition, we must analyze the communication phase of the tasks to be distributed. Therefore, we provide an offloading model to determine the time required for distributed processing.

Offloading model

The proposed offloading model is visualized in Fig. 6. As can be seen, parallelizable node $n_2$ is processed in a distributed manner as multiple offloading tasks. In addition, a new node is created before and after the distributed processing. The front node divides the work of the task, and the back node aggregates the distributed processing results. Here, the execution processing time of distributed processing can be calculated using Amdahl's law. The amount of communication data of each offloading task depends on the given application, e.g., the communication data may be distributed equally according to the number of offloading tasks. Therefore, we inherit the amount of communication data before offloading as the amount of communication data for each offloading task. As a result, the worst-case communication time in offloading is taken into account.

The execution time of the nodes before and after the offloading tasks is application-dependent; therefore, we derive their execution processing time in proportion to the amount of data to be communicated. Here, we use margin δ to set the execution time of the preceding and succeeding nodes. Assuming the front node is $F_i$, the execution time of each phase is calculated as follows.

$$r(F_i) = w(F_i) = \lceil data^{r}_{i} / D_{slot} \rceil \times T_{slot}, \tag{26}$$

$$e(F_i) = (1 + \delta) * \bigl( \lceil data^{r}_{i} / D_{slot} \rceil \times T_{slot} \bigr), \tag{27}$$

Similarly, for back node $B_i$, each execution time can be calculated as follows.

$$r(B_i) = w(B_i) = \lceil data^{w}_{i} / D_{slot} \rceil \times T_{slot}, \tag{28}$$

$$e(B_i) = (1 + \delta) * \bigl( \lceil data^{w}_{i} / D_{slot} \rceil \times T_{slot} \bigr), \tag{29}$$

The communication processing time of each offloading task can be calculated in the same manner. In the reading process, the data written by the front node are read; thus, the read processing time is equal to the communication processing time of the front node. Similarly, the write processing time is equal to the communication time of the back node.

$$r(O_i) = r(F_i) = w(F_i), \tag{30}$$

$$w(O_i) = r(B_i) = w(B_i), \tag{31}$$

The LET model is applied to the offloading tasks $O_i$, as well as to the front and back nodes. Each communication phase in the offloading model is also scheduled such that memory accesses do not overlap.

Determining the number of cores for offloading

Here, we describe how to determine the number of cores over which parallelizable tasks are distributed. Note that the appropriate number of cores depends on the given application and developer, and it is difficult to determine the optimal number theoretically. Thus, we propose a method that varies the number of cores used for distributed processing and selects the best solution from among the solutions obtained in all cases. Here, all jobs of the parallelizable

(offloaded) task must be executed using the determined cores and the number of cores.

The pseudocode of the proposed algorithm is given as Algorithm 2. The algorithm keeps the result of parallelization with 1-16 cores (lines 4-16) and adopts the number of cores that yields the best schedule (line 17). If a deadline-miss occurs even after offloading has been performed, the idle time in the LET section is reduced gradually (lines 18-26).

Fig. 7 shows the result of scheduling the DAG in Fig. 5. The memory access phases of all LET tasks are adjusted so that they do not overlap. In addition, the proposed algorithm finds that offloading of the parallel node is best when it is distributed over four cores. As can be seen from the figure, multiple jobs generated from the same task are executed using the same cores and the same number of cores.

[Fig. 7. Schedule result of sample DAG. Gantt chart of cores 0 to 4 over one hyperperiod; legend: task before offloading, offloading task, task after offloading.]

VI. EVALUATION

A. Simulation Method

As input data, we used applications from the StreamIT benchmark suite modeled as fork-join graphs [9] [10]. Furthermore, we used task graphs generated by Task Graphs For Free (TGFF) [11]. TGFF can generate random DAGs for various parameters, such as the number of tasks, the number of entry nodes, and the maximum in-degree and out-degree. We set the experimental parameters as shown in Table I.

TABLE I
TASK GRAPH CHARACTERISTICS GENERATED BY TGFF.

  Parameter             Value
  #Task-graphs          100
  #Tasks                <40, 70, 100> (min, avg, max)
  WCET                  [1;2000]
  Exchanged data        [1;100]
  Maximum in-degree     3
  Maximum out-degree    3
  #Entry nodes          <2, 5, 8> (min, avg, max)

The evaluation was performed using a single cluster of the Kalray MPPA, i.e., 16 cores. We also assume that the node with the longest computation time in the original DAG is a parallel node and can be offloaded. Throughout, we set the parameters as follows: α = 1.0, δ = 0.2, K = 0.7, $T_{slot}$ = 3, and $D_{slot}$ = 10.

B. Evaluation Results for Benchmark Application

The StreamIT benchmarks are fork-join graphs; that is, each DAG has exactly one entry node and one exit node. Therefore, they were used to evaluate how much the schedule length (i.e., makespan) could be reduced by the proposed method. As margin β of LET increases, the makespan increases, as shown in Fig. 8. In Fig. 8, DCT_verify is scheduled while changing LET margin β, and the obtained makespan is shown. Here, we simply schedule LET tasks to avoid contention, and do not reduce the LET interval or offload tasks. We call such a schedule the normal schedule. Here, the deadline margin γ was set to 0.2. As can be seen, if the LET interval is too large, the set deadline may not be satisfied. Therefore, we set a LET section and a deadline, and we investigated satisfying the deadline while reducing the LET section as little as possible.

[Fig. 8. Impact of LET margin on schedule. Makespan and deadline (time units, 0 to 120000) versus LET margin β (0 to 0.5).]

In Fig. 9, the amount by which the proposed methods (LET reduction, offloading, and both) shorten the schedule with respect to the normal schedule is shown as gain. Here, we set the margins as follows: β = 0.5, γ = 0.2. Above each "LET Reduction" bar, the LET margin β at which the deadline-miss disappears is written. Above each "Offloading" bar, the number of cores used for task offloading is shown. Finally, above each "Offloading + LET Reduction" bar, both the LET margin β and the number of cores are shown.

As the margin of the deadline becomes smaller, the deadline also becomes tighter. Therefore, it is difficult to satisfy the deadline using a normal scheduling method. However, when the deadline setting is severe (i.e., the deadline margin γ is small), much of the idle time of the LET sections is removed by LET reduction, which reduces the benefits of the LET model. With the offloading method, the schedule is shortened by distributing parallel nodes across multiple cores without reducing LET sections. Here, we offloaded the node with the longest computation time in the DAG. In addition, when multiple parallel nodes are present, the schedule can be shortened further. However, the amount of schedule reduction depends on the position of the offloaded node in the graph, and for some graphs offloading has no effect (e.g., Beamformer and DCT_comp). The number of cores used for distributed processing varied depending on the graph and ranged from 1 to 13. On the other hand, the proposed method (i.e., offloading + LET reduction) reduces the LET section until the deadline is satisfied after task offloading. As a result, a schedule that satisfies the deadline can be generated without excessively reducing the LET section. If there are no tasks with a large WCET, the offloading method may not work well; therefore, it is preferable to select a scheduling method that meets the deadline according to the application. For all benchmarks, the time required to run the algorithm was less than 20 minutes.
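To make the parameter values of Section VI-A concrete, the offloading model of Section V-B (Equations (25) to (31)) can be evaluated numerically. The sketch below is only an illustration: the class and function names are hypothetical, the ceiling on the slot count is an assumption about the slot-based communication model, and the example node (execution time 1000, 25 units read, 12 units written, 4 cores) is invented for demonstration and does not come from the paper's benchmarks.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    delta: float = 0.2    # margin for front/back node execution time
    k: float = 0.7        # parallelizable ratio K in Amdahl's law
    t_slot: float = 3.0   # time taken by one communication slot
    d_slot: float = 10.0  # amount of data moved in one slot


def comm_time(data: float, p: Params) -> float:
    """Slot-based communication time, assuming ceil(data / D_slot) slots."""
    return math.ceil(data / p.d_slot) * p.t_slot


def offloading_times(e_node: float, data_r: float, data_w: float,
                     n_cores: int, p: Params = Params()) -> dict:
    """Return (read, execute, write) times for front, offloading, and back nodes."""
    r_f = w_f = comm_time(data_r, p)            # Eq. (26)
    e_f = (1.0 + p.delta) * r_f                 # Eq. (27)
    r_b = w_b = comm_time(data_w, p)            # Eq. (28)
    e_b = (1.0 + p.delta) * r_b                 # Eq. (29)
    # Eq. (25): e(n_i) / S(N), with S(N) = 1 / ((1 - K) + K / N)
    e_o = e_node * ((1.0 - p.k) + p.k / n_cores)
    return {
        "front": (r_f, e_f, w_f),
        "offload": (r_f, e_o, r_b),             # Eqs. (30) and (31)
        "back": (r_b, e_b, w_b),
    }


if __name__ == "__main__":
    for name, (r, e, w) in offloading_times(1000.0, 25.0, 12.0, 4).items():
        print(f"{name:8s} read={r:.1f} exec={e:.1f} write={w:.1f}")
```

In this example the front and back nodes add only a few time units each, while the offloaded execution phase shrinks from 1000 to about 475 time units, which shows why offloading pays off mainly for nodes with a long computation time.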

[Fig. 9. Schedule reduction by proposed method. Bar chart of gain (%) for LET Reduction, Offloading, and Offloading + LET Reduction; bars are annotated with the LET margin, the number of cores used for offloading, or both.]

C. Evaluation Results for Multi-rate DAGs

Next, we evaluate task graphs created by TGFF. We evaluate the performance of the proposed method using the average value of the schedule results over the task graphs. Unlike the benchmark applications, these are evaluations for multi-rate DAGs that have multiple periods. We observe changes in the deadline-miss ratio due to changes in each margin. The deadline-miss ratio reflects the number of cases in which the deadline of an end node in a DAG could not be met.

The fluctuation of the deadline-miss ratio when the deadline margin γ is changed is shown in Fig. 10. Here, we set the LET margin to β = 0.5. In addition, the number of period values is fixed to two so that the tendency can be observed. The smaller the deadline margin, the more severe the deadline. Furthermore, the created graphs have many end nodes after jobs are created. Therefore, a large number of deadline-misses occur with a normal scheduling method that simply prevents memory contention. On the other hand, the proposed method, which offloads parallel nodes and reduces the idle time of the LET section, can significantly reduce the deadline-miss ratio. This method, which avoids contention and preserves as much of the LET section as possible, is considered to be very effective in hard real-time application development. Margins α and δ had very little effect on the entire schedule, and no tendency in the deadline-miss ratio could be observed when these margins were changed.

[Fig. 10. Impact of deadline margin γ. Deadline-miss ratio (0 to 1) versus deadline margin γ (0.1 to 0.5) for the normal schedule and the proposed method (offloading + LET reduction).]

VII. RELATED WORK

Task scheduling on multi-/many-core platforms is generally considered an NP-hard problem. Thus, heuristic scheduling algorithms have been studied extensively, and most such algorithms are based on list scheduling. For example, Igarashi et al. [12] proposed a list-scheduling-based heuristic technique for the Kalray MPPA2-256 Bostan. They performed scheduling for parallel computations according to the amount of task computation, and they successfully reduced the makespan compared with the existing method. Rouxel et al. [9] [13] introduced heuristic contention-aware scheduling strategies that generate a time-triggered schedule for application tasks.

In addition, when using a multi-/many-core processor, contention for shared resources can hinder real-time performance, which is a significant problem. Many studies have improved the predictability of access timing to shared memory by dividing a task into an execution phase and multiple communication phases (e.g., PREM (PRedictable Execution Model) [14] [15] and read-execute-write semantics [6] [8] [9]). As a result, it is possible to avoid contention, accurately estimate delay, and increase application execution speed. From a hardware perspective focusing on NoC communication, a division strategy that reduces contention has been proposed previously [16]. Perret et al. [17] provided an execution model that limits the behavior of the application on the platform in order to perform temporally isolated partition mapping on the Kalray MPPA2-256 Bostan.

The LET paradigm is attracting significant attention as a way to consider contention on multi-/many-core processors. The LET paradigm was originally proposed within a programming language for embedded systems [4]. Recently, to bring determinism to a real-time system, research into an autonomous driving system has been conducted [5].

Note that the number of tasks that use memory increases when adopting the LET model; therefore, memory usage is expected to increase. To address this problem, a memory usage reduction method using double buffering has been proposed previously [18] [19]. Biondi et al. [20] described how to apply the LET model to the AUTOSAR model for implementation in actual automotive systems using multi-core platforms.

In an automobile engine management system, end-to-end latency, including execution and communication of multiple tasks, must be constant. To address this, Martinez et al. [21] employed offset allocation to reduce output jitter when using

the LET model, which contributed to improving real-time performance. Hamann et al. proposed a method to calculate task communication costs when applying the LET paradigm [22]. Their evaluation demonstrated that LET communication is extremely deterministic and reduces communication overhead, although communication time was greater than with the existing communication method on the multi-/many-core platform.

Research into LET is generally focused on engine control, and little attention has been paid to scalability with the number of cores. Applying the LET model to an application may increase task execution time. Since multiple applications operate simultaneously in self-driving systems, it is necessary to consider the deadlines of multiple tasks. We address these issues using multi-/many-core processors and contribute to the scalability of the LET model for large-scale applications, e.g., self-driving systems. For that purpose, it is necessary to apply the LET model and to schedule and map application tasks appropriately.

In this paper, we aimed to improve scalability by performing distributed processing of tasks to which the LET paradigm is applied, using multi-/many-core processors. We modeled the application as a DAG and applied the LET model to each node. In addition, we learned from the above-mentioned time-triggered schedules and adjusted the schedule so that memory accesses would not occur simultaneously. A time-triggered schedule using LET can adjust the schedule with finer slots than when LET is not used. In addition, even if the task execution time increases or decreases, there is no need to change the schedule significantly, and the behavior of the application does not need to be changed. As a result, the method proposed in this paper performs LET task scheduling without contention, and shortens the schedule or reduces deadline-misses while keeping the LET interval as long as possible.

VIII. CONCLUSIONS AND FUTURE WORK

In this paper, we have proposed a theoretical scheduling algorithm for a DAG with multiple periods in consideration of communication contention and offloading tasks. To prevent communication contention, the proposed method applies the LET model to a DAG application and coordinates memory access phases. We have also proposed methods to schedule multi-rate DAGs, generate jobs within a hyperperiod, and define their dependencies. In this study, we treated deadline-misses as a problem and proposed a scheduling method that performs LET section reduction and task offloading. The experimental results indicate that the proposed algorithm can improve the deadline-miss rate of a multi-rate DAG compared to normal LET task scheduling. We also observed a maximum improvement of about 40% in the length of the schedule. The proposed method is considered to be effective for hard real-time applications because it reduces the length of the schedule while preserving as much of the free time in the LET section as possible.

In the future, we plan to investigate scheduling methods that consider both multiple clusters and memory capacity.

ACKNOWLEDGMENT

This work was partially supported by JST PRESTO Grant Number JPMJPR1751.

REFERENCES

[1] S. Kato, S. Tokunaga, Y. Maruyama, S. Maeda, M. Hirabayashi, Y. Kitsukawa, A. Monrroy, T. Ando, Y. Fujii, and T. Azumi, "Autoware on board: Enabling autonomous vehicles with embedded systems," in Proc. of ICCPS, 2018, pp. 287-296.
[2] Y. Maruyama, S. Kato, and T. Azumi, "Exploring Scalable Data Allocation and Parallel Computing on NoC-Based Embedded Many Cores," in Proc. of ICCD, 2017, pp. 225-228.
[3] A. Munir, S. Ranka, and A. Gordon-Ross, "High-performance energy-efficient multicore embedded computing," IEEE Transactions on Parallel and Distributed Systems, vol. 23, no. 4, pp. 684-700, April 2012.
[4] T. A. Henzinger, B. Horowitz, and C. M. Kirsch, "Giotto: a time-triggered language for embedded programming," Proceedings of the IEEE, vol. 91, no. 1, pp. 84-99, Jan 2003.
[5] C. M. Kirsch and A. Sokolova, "The logical execution time paradigm," in Advances in Real-Time Systems, 2012.
[6] M. Becker, S. Mubeen, D. Dasari, M. Behnam, and T. Nolte, "Scheduling multi-rate real-time applications on clustered many-core architectures with memory constraints," in Proc. of ASP-DAC, 2018, pp. 560-567.
[7] B. D. De Dinechin, D. Van Amstel, M. Poulhiès, and G. Lager, "Time-critical computing on a single-chip massively parallel processor," in Proc. of DATE, 2014, pp. 1-6.
[8] M. Becker, D. Dasari, B. Nikolic, B. Akesson, V. Nélis, and T. Nolte, "Contention-free execution of automotive applications on a clustered many-core platform," in Proc. of ECRTS, 2016, pp. 14-24.
[9] B. Rouxel, S. Derrien, and I. Puaut, "Tightening contention delays while scheduling parallel applications on multi-core architectures," ACM Transactions on Embedded Computing Systems (TECS), pp. 164:1-164:20, 2017.
[10] B. Rouxel and I. Puaut, "STR2RTS: Refactored StreamIT benchmarks into statically analysable parallel benchmarks for WCET estimation & real-time scheduling," in Proc. of OASIcs-OpenAccess Series in Informatics, vol. 57, 2017.
[11] R. P. Dick, D. L. Rhodes, and W. Wolf, "TGFF: task graphs for free," in Proc. of Workshop on CODES/CASHE, 1998, pp. 97-101.
[12] S. Igarashi, Y. Kitagawa, T. Ishigooka, T. Horiguchi, and T. Azumi, "Multi-rate DAG scheduling considering communication contention for NoC-based embedded many-core processor," in Proc. of DS-RT, 2019.
[13] B. Rouxel, S. Skalistis, S. Derrien, and I. Puaut, "Hiding Communication Delays in Contention-Free Execution for SPM-Based Multi-Core Architectures," in Proc. of ECRTS, 2019, pp. 1-24.
[14] R. Pellizzoni, E. Betti, S. Bak, G. Yao, J. Criswell, M. Caccamo, and R. Kegley, "A predictable execution model for cots-based embedded systems," in Proc. of RTAS, 2011, pp. 269-279.
[15] A. Alhammad and R. Pellizzoni, "Time-predictable execution of multi-threaded applications on multicore systems," in Proc. of DATE, 2014, pp. 1-6.
[16] M. Becker, B. Nikolic, D. Dasari, B. Akesson, V. Nélis, M. Behnam, and T. Nolte, "Partitioning and analysis of the network-on-chip on a COTS many-core platform," in Proc. of RTAS, 2017, pp. 101-112.
[17] Q. Perret, P. Maurere, E. Noulard, C. Pagetti, P. Sainrat, and B. Triquet, "Temporal isolation of hard real-time applications on many-core processors," in Proc. of RTAS, 2016, pp. 1-11.
[18] S. Resmerita, A. Naderlinger, and S. Lukesch, "Efficient realization of logical execution times in legacy embedded software," in Proc. of MEMOCODE, 2017, pp. 36-45.
[19] M. Ogawa, S. Honda, and H. Takada, "Efficient approach to ensure temporal determinism in automotive control systems," in Proc. of ISED, 2018, pp. 53-57.
[20] A. Biondi and M. Di Natale, "Achieving predictable multicore execution of automotive applications using the LET paradigm," in Proc. of RTAS, 2018, pp. 240-250.
[21] J. Martinez, I. Sanudo, and M. Bertogna, "Analytical characterization of end-to-end communication delays with logical execution time," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. PP, pp. 1-1, 07 2018.
[22] A. Hamann, D. Dasari, S. Kramer, M. Pressler, and F. Wurst, "Communication centric design in complex automotive embedded systems," in Proc. of ECRTS, 2017, pp. 10:1-10:20.


NUMA-Aware Non-Blocking Calendar Queue

Maryan Rab∗ , Romolo Marotta† , Mauro Ianni† , Alessandro Pellegrini† , Francesco Quaglia∗†
∗ University of Rome “Tor Vergata”, Italy
Email: maryan.rab@gmail.com, francesco.quaglia@uniroma2.it
† Lockless S.r.l., Rome, Italy
Email: {marotta, ianni, pellegrini}@lockless.it

Abstract: Modern computing platforms are based on multi-processor/multi-core technology. This allows running applications with a high degree of hardware parallelism. However, medium-to-high end machines pose a problem related to the asymmetric delays threads experience when accessing shared data. Specifically, Non-Uniform-Memory-Access (NUMA) is the dominating technology, thanks to its capability for scaled-up memory bandwidth; however, it imposes asymmetric distances between CPU-cores and memory banks, so that an access by a thread to data placed on a far NUMA node severely impacts performance. In this article, we tackle this problem in the context of shared event-pool management, a relevant aspect in many fields, like parallel discrete event simulation. Specifically, we present a NUMA-aware calendar queue, which also has the advantage of making concurrent threads coordinate via a non-blocking scalable approach. Our proposal is based on work deferring combined with dynamic re-binding of the calendar queue operations (insertions/extractions) to the best suited among the concurrent threads hosted by the underlying computing platform. This changes the locality of the operations by threads in a way positively reflected onto NUMA tasks at the hardware level. We report the results of an experimental study, demonstrating the capability of our solution to achieve on the order of 15% better performance compared to state-of-the-art solutions already suited for multi-core environments.

Index Terms: NUMA, calendar queue, non-blocking data structures

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

1. Introduction

The current trend in computing architectures is characterized by an ever-increasing core-level parallelism. This is motivated by the need for scaled-up computing capabilities in the face of the physical limits imposed by transistor technology [1], [2], [3]. This trend has made concurrent and parallel programming paradigms mandatory for current and next-generation applications. Furthermore, it has brought non-blocking thread coordination [4] to assume a central role in the design and implementation of modern concurrent applications. Incidentally, this type of coordination, which avoids critical sections and simply exploits atomic Read-Modify-Write (RMW) instructions offered by the ISA to let concurrent threads gather information on whether conflicting accesses to shared data structures have occurred, also provides resilience against performance degradation in CPU-stealing contexts, like Cloud computing.

However, modern hardware platforms are also characterized by asymmetries, which also play a role in the actual performance deliverable by parallel/concurrent applications. One of the most important asymmetries is the so-called Non-Uniform-Memory-Access (NUMA). It is based on having memory banks organized in a configuration where each processor has some close bank(s), which form the local NUMA node, and farther ones, which form the far NUMA nodes. Consequently, the need for accessing data from far NUMA nodes induces higher latency and traffic on the memory-interconnection hardware components. In these architectures, locality in the accesses plays a role not only for cache exploitation, but also for RAM exploitation, since accesses to far RAM banks should be avoided as much as possible.

The challenges posed by multi-processor/multi-core NUMA platforms have long been addressed in the literature. In fact, most Operating System (OS) implementations offer APIs to directly control the placement of logical pages in RAM, or to dynamically migrate them across NUMA nodes if required. Also, OS-level solutions have the capability to migrate threads, and the data they are currently touching, to favor accesses to the nearest (local) NUMA node of a given CPU core.

However, OS-level solutions only provide mid/long term binding between threads/data and NUMA nodes. Furthermore, the concept at the base of these solutions is to pack threads and their hot data on the same NUMA node, which is not adequate for the case of very large counts of CPU-bound threads that share very large amounts of logical memory, possibly performing frequent fine-grain operations on it. This is the case of last-generation parallel simulation platforms, especially those based on speculative processing schemes [5]. In these scenarios, the "same NUMA-node" packing approach does not work since threads would simply be brought to compete for the same CPU-cores, leading to performance degradation.

Based on the above considerations, we feel that the

NUMA aspect should be directly incorporated into the design of algorithms for managing shared data. Hence, in this article we present a NUMA-aware design of a shared event-pool based on the Calendar-Queue archetype. Our solution explicitly controls the locality of the accesses to hardware-level memory resources, namely NUMA nodes. This is done by relying on cross-thread insertions of elements in the calendar, where the thread starting the insertion will not complete it if the target time-bucket of the calendar is hosted by a far NUMA node. In these scenarios, we adopt a deferred-work strategy, leading other threads participating in the application, which are hosted by those far NUMA nodes, to actually finalize (flush) these insertion operations. On the other hand, we explicitly control the delay in the deferring scheme. This avoids the situation in which, when the elements whose insertion was deferred need to be extracted, the whole burden of flushing the deferred insertions for the target time-bucket falls on an inconvenient thread, i.e., one running on a far NUMA node. In our scheme, we not only share the data structure among threads, but we also share the work to be done on the data structure so that it is carried out by the most convenient threads. Our solution still relies on non-blocking coordination of the threads in all of their operations, including posting the deferred work and flushing it. This enables us to actually achieve non-blocking insertions/extractions from the calendar, which has already been shown to play a core role in concurrent event-pool management applications [5]. Based on all its features, we have called our data structure the Lock-Free Deferred-Work Calendar Queue (LFDWCQ).

A different approach has been provided in [11]. It is based on a general technique for NUMA awareness, called Fast Fly-weight Delegation (FFWD) and introduced in [12], which resorts to a dedicated server thread to operate on remote memory banks in a NUMA topology. With this solution, locality at the hardware level is achieved since data hosted by a given NUMA node are only touched by specific server threads, which are bound to CPU-cores on that NUMA node. However, this approach is implicitly blocking, since a thread that asks a server thread to operate on the data structure is blocked until the server reply arrives. In contrast, our solution is fully non-blocking, hence implicitly more scalable, and does not require pre-partitioning of threads into clients and servers (with respect to NUMA-oriented data access).

The issue of memory-access latency asymmetries in NUMA architectures has also been tackled by using strategies where the same data are replicated across multiple NUMA nodes [13]. This makes them quickly accessible to threads running on CPU-cores hosted by any NUMA node, at the cost of using mechanisms for making the replicated data instances coherent; this cost may become prohibitive for intensive and/or fine-grain data update operations. In our solution we avoid this cost entirely since we do not use replication. Furthermore, we are able to manage NUMA-optimized accesses in scenarios with fine-grain tasks operating in update mode on the data structures; in fact, insertions and extractions from LFDWCQ are actual update operations.

As for the specific problem we tackle in this article,
We also report data for an experimental comparison of namely event-pool management, a wide literature exists on
LFDWCQ with state-of-the-art non-blocking versions of the making the event pool efficient—in terms of both asymptotic
Calendar Queue—which are however not NUMA-aware— and actual costs—like Calendar [14], Ladder [15] and LOCT
showing how our proposal can achieve up to 15% better [16] queues. Furthermore, enhancements of these data struc-
performance when running a classical event-pool benchmark tures (or of other data structure flavors like lists or trees)
on top of a commodity machine equipped with 32 CPU- have been proposed for the case of concurrent accesses, in
cores and 64GB of memory organized in 8 NUMA nodes. particular by making the data structures accessible via non-
The remainder of this article is structured as the fol- blocking algorithms that enable scalability (see, e.g., [17],
lowing. In Section 2 we discuss related work. LFDWCQ is [18], [19], [20], [21]). However, the proposed algorithms
presented in Section 3. In Section 4 we report experimental have no intent to improve the locality of accesses with
results. respect to architectures characterized by highly asymmetric
memory, such as NUMA platforms. Hence, our proposal
is an improvement over these literature solutions, as we
2. Related Work also demonstrate via experimental results. In fact, beyond
NUMA awareness, we also retain the lock-freedom prop-
As pointed out, an approach to cope with NUMA is erty, since our data structure provides fully non-blocking
based on OS level (or middleware level) facilities that operations
dynamically place threads and their working set of logical
pages on a same NUMA node [6], [7], [8], [9], [10]. These 3. Lock-Free Deferred-Work Calendar Queue
approaches have been shown to be effective in scenarios
where threads actually express locality in the access to LFDWCQ is built on top of a non-blocking and conflict-
groups of logical pages. They are not suited for scenarios resilient Calendar Queue [21] (CRCQ). This data structure
where the access pattern to data is highly variable, like when splits the domain of event timestamps into partitions, called
there is no stable binding of tasks by threads to portions virtual buckets, and maps them to a circular array of ordered
of the shared data [5]. We cope with this limitation for the linked lists, denoted as physical buckets. Also, the number
case of fully shared event-pool management, which is a core of events per bucket is guaranteed to be bounded by a con-
aspect in modern simulation systems to be run on top of stant which is independent from the queue size, delivering
multi-core machines. amortized constant-time accesses. Whenever the number of
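As a rough illustration of the bucket mapping just described for calendar queues such as CRCQ, the C sketch below maps a timestamp to a virtual bucket and a virtual bucket to an entry of the circular array of physical buckets. It is only a sketch under stated assumptions: the type `calendar_geometry` and the functions `virtual_bucket` and `physical_bucket` are hypothetical names introduced here, timestamps are assumed non-negative, and the actual CRCQ code [21] may organize this differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical calendar geometry (illustrative names, not taken from CRCQ). */
typedef struct {
    double   bucket_width;   /* timestamp span covered by one virtual bucket */
    uint64_t num_physical;   /* length of the circular array of physical buckets */
} calendar_geometry;

/* Index of the virtual bucket (timestamp partition) that contains ts. */
static uint64_t virtual_bucket(const calendar_geometry *g, double ts)
{
    assert(g->bucket_width > 0.0 && ts >= 0.0);
    /* Truncation equals floor() here because ts is non-negative. */
    return (uint64_t)(ts / g->bucket_width);
}

/* Virtual buckets wrap around the circular array of physical buckets. */
static uint64_t physical_bucket(const calendar_geometry *g, uint64_t vb)
{
    assert(g->num_physical > 0);
    return vb % g->num_physical;
}
```

With a bucket width of 0.5 and 4 physical buckets, for example, timestamp 2.6 falls in virtual bucket 5, which shares physical bucket 1 with virtual buckets 1, 9, 13 and so on; this is why each physical bucket has to be kept as an ordered list.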


[Figure 1 diagram: the DWSTR array of list heads (H); each entry heads a physical bucket (a non-blocking linked list) of DWB nodes (virtual buckets), and each DWB holds a DWV (array of events).]
Figure 1. Layout of the front-end DWQ.

items is no longer balanced across the physical buckets, a resize phase doubles/halves the number of physical buckets in order to restore the balance, so as to keep control over the number of steps performed during insertions and extractions (since these steps depend on the number of events in each bucket). CRCQ provides all these features in a non-blocking fashion and jointly delivers conflict resiliency for extractions, which highly contend in the access to the bucket that keeps the minimum-timestamp event and are well known to be critical for any concurrent priority queue because of their impact on caches in multi-core platforms [17], [22]. We designed our solution to maintain the same progress (lock-freedom) and scalability (conflict-resiliency) guarantees of the CRCQ.

In the following sections, we introduce the main idea at the core of LFDWCQ and describe its actual structure and the operations taking place on it.

3.1. The idea in a nutshell

LFDWCQ has three main design principles:
1) postpone (defer) insertions of far-future events;
2) group them in a batch to control locality of memory accesses;
3) provide non-blocking progress of threads.

To achieve all these goals, we paired CRCQ with a front-end data structure called Deferred Work Queue (DWQ) aimed at maintaining events whose management has been deferred. In more detail, these events are not directly connected to the underlying calendar queue upon their insertion; rather, they are appended to DWQ in order to be processed later, in a more favorable phase of execution of some thread. On the other hand, extractions are performed directly from the CRCQ, which (as mentioned) embeds advanced techniques towards conflict resiliency and scalability. Overall, events inserted into the front-end DWQ will eventually be migrated to the underlying (back-end) CRCQ.

In order to make our approach effective, adding items to DWQ has to be a low-latency and low-memory-footprint operation, otherwise the costs for posting events exceed the benefits given by their postponed batch insertion. For this reason, the DWQ layout, shown in Figure 1, is mainly based on arrays. It resembles the classical calendar queue arrangement by having an array, called Deferred Work Struct (DWSTR), of linked lists. Differently from ordinary calendar queues, virtual buckets are explicitly maintained by means of individual nodes, called Deferred Work Buckets (DWB), within physical buckets. In turn, each node (i.e. a virtual bucket) maintains deferred events in an unsorted array of fixed size, denoted as Deferred Work Values (DWV) (see footnote 1). Essentially, our DWQ is a three-dimensional calendar queue where two dimensions are materialized within arrays and one with the usage of ordered non-blocking linked lists [24].

1. We could easily support dynamically sized arrays by resorting to lock-free dynamic vectors [23], but this is not the main focus of this work.

Before items can be extracted, they have to be migrated from DWQ to the underlying CRCQ. Generally, inserting an item into a calendar queue is a cache-unfriendly operation because it involves the traversal of linked lists (the physical buckets), which are well known to have poor spatial locality and to be sub-optimal in terms of cache usage (at least for large-sized lists). This is exacerbated in the case of NUMA architectures, where a miss in the Last Level Cache (LLC) might trigger a request to a remote cache and/or memory component. All these shortcomings are still present when migrating events from DWQ to the back-end CRCQ. However, our solution alleviates all these problems by migrating items falling in the same virtual bucket in batch. This avoids repeated traversals of nodes within the same physical bucket and allows reusing most of the steps performed during a previous migration of an event, significantly increasing the temporal locality of insertions.

Since extractions are performed from the underlying calendar queue, we need to ensure that all the items belonging to the current virtual bucket have been migrated from DWQ, allowing threads to obtain events by resorting to the original extraction logic provided by CRCQ (events with lower timestamps must be extracted before others). To achieve this goal, threads have to trigger a migration whenever a bucket becomes the new target for extractions, that is, when it becomes hot. However, this reactive approach is not suitable for scalability and NUMA-awareness. In fact, regardless of its placement within the NUMA topology, any thread might trigger such a reactive migration, increasing the probability of conflicts.

In more detail, the current bucket (the one keeping the lower timestamp events) has become hot because it is concurrently targeted by all the threads performing extractions. Also, it is frequently updated by memory-write accesses to signal item removals. This has a dramatic impact on performance because of the costs associated with the cache-coherency protocols running on firmware. In order not to worsen this already challenging scenario, we adopted a proactive approach for migrating events from DWQ to the calendar. In particular, instead of migrating items belonging to the already hot buckets, we flush in advance "mid-temperature" virtual buckets, namely those that are in the (near) future of the currently hot bucket. This guarantees that, whenever a flushed bucket becomes hot, all its items have already been migrated. To further reduce the probability of conflict during migration phases, we also adopt an
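The DWQ layout of Figure 1 (a DWSTR array of lists of DWB nodes, each holding a fixed-size DWV array) can be pictured with the C declarations below. This is a minimal sketch under assumptions, not the authors' code: the field names, the capacity `DWV_SIZE`, and the helpers `dwq_new`, `dwb_new` and `dwq_list_head` are hypothetical, and the ordered non-blocking list operations [24] are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DWV_SIZE 8  /* assumed fixed capacity of each DWV array */

typedef struct event { double timestamp; } event;

/* DWB: one node of a physical-bucket list, i.e. one virtual bucket. */
typedef struct dwb {
    uint64_t              vb_index;       /* key used to keep the list ordered */
    _Atomic(struct dwb *) next;           /* next DWB in the same physical bucket */
    atomic_uint           index;          /* next free slot in dwv */
    _Atomic(event *)      dwv[DWV_SIZE];  /* unsorted array of deferred events */
} dwb;

/* DWQ: the DWSTR array whose entries head the physical-bucket lists. */
typedef struct dwq {
    size_t          num_lists;
    _Atomic(dwb *) *dwstr;
} dwq;

/* Allocate a DWQ with n empty lists (returns NULL on allocation failure). */
static dwq *dwq_new(size_t n)
{
    assert(n > 0);
    dwq *q = malloc(sizeof *q);
    if (q == NULL) return NULL;
    q->dwstr = malloc(n * sizeof *q->dwstr);
    if (q->dwstr == NULL) { free(q); return NULL; }
    q->num_lists = n;
    for (size_t i = 0; i < n; i++)
        atomic_init(&q->dwstr[i], NULL);
    return q;
}

/* Allocate a DWB that already holds its first event ("prefilled"). */
static dwb *dwb_new(uint64_t vb_index, event *first)
{
    dwb *b = malloc(sizeof *b);
    if (b == NULL) return NULL;
    b->vb_index = vb_index;
    atomic_init(&b->next, NULL);
    atomic_init(&b->index, 1u);
    atomic_init(&b->dwv[0], first);
    for (size_t i = 1; i < DWV_SIZE; i++)
        atomic_init(&b->dwv[i], NULL);
    return b;
}

/* DWSTR entry whose list holds the given virtual bucket. */
static _Atomic(dwb *) *dwq_list_head(dwq *q, uint64_t vb_index)
{
    return &q->dwstr[vb_index % q->num_lists];
}
```

The three dimensions mentioned in the text appear here as the `dwstr` array, the `next`-linked list of DWBs, and the `dwv` array inside each DWB.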


Algorithm 1 LFDWCQ algorithms

procedure ENQUEUE(event e)
    res ← DWQ.insert(e)
    if res = FAIL then
        res ← CRCQ.insert(e)
    return res

procedure DEQUEUE(void)
    current_vb ← CRCQ.getIndexOfCurrentBucket()
    if current_vb ≠ last_vb then
        future_vb ← getWarmBucket(current_vb)
        if ¬DWQ.isMigrated(future_vb) then
            DWQ.migrate(future_vb)        ▷ Proactive invocation
    if ¬DWQ.isMigrated(current_vb) then
        DWQ.migrate(current_vb)           ▷ Reactive invocation
    last_vb ← current_vb
    return CRCQ.dequeue()

procedure RESHUFFLE(void)
    for each vb in DWQ do
        DWQ.migrate(vb)
    CRCQ.reshuffle()

assignment policy of virtual buckets to NUMA nodes, namely only those threads running on a specific NUMA node can flush a given bucket. Moreover, if multiple threads detect that they are migrating the same bucket, we allow just one thread to proceed while the other threads can migrate other buckets or simply continue their computation, reducing the probability of conflicts. Finally, such proactive work can be performed only by threads that have invoked the dequeue API of LFDWCQ. This is a deliberate choice that reduces the actual contention on a very expensive, conflict-prone operation (event extraction) while still performing useful work.

We stress again that all these features have been provided in a non-blocking fashion thanks to the usage of custom state machines and algorithms aimed at maintaining the same progress and correctness guarantees provided by the underlying CRCQ, namely lock-freedom [4] and linearizability [25].

3.2. LFDWCQ operations

LFDWCQ exposes the same API as a conventional priority queue. The logic embedded in the API functions is depicted in Algorithm 1. As hinted, insertions are deferred. On the other hand, extractions are never postponed. Consequently, the latter are first required to check that specific prerequisites have been satisfied before proceeding. In fact, we have to be sure that items belonging to the current virtual bucket (the one targeted by the extraction) have been migrated from DWQ to CRCQ. This step can be performed in two different ways: proactively, namely migrating items associated with a virtual bucket which is not the one currently used for extractions, and reactively in the opposite case. Such migrations introduce additional activities not required in the original extraction operations characterizing CRCQ, which we exploited as a back-off scheme to reduce contention and conflicts upon item removals.

[Figure 2 diagram: the states INS, VAL, FLS and MVD, in this order.]
Figure 2. State machine representing the evolution of virtual buckets within the DWQ.

Since we want to provide non-blocking progress, the migration phase requires careful handling of the virtual bucket states within the DWQ. Their evolution is driven by the state machine shown in Figure 2, whose states are described in the following.

INS: indicates that the virtual bucket still accepts new events;
VAL: signals that the virtual bucket no longer accepts new items and a validation of already inserted items has started;
FLS: tells us that the validation has completed and items are being migrated into the CRCQ;
MVD: is used to communicate that all the items have been migrated and this instance of a virtual bucket can be collected to reclaim its memory.

The main role of these states is to ensure that items cannot be lost during DWQ operations due to the non-atomicity of a migration phase. As is typical in non-blocking data structures [24], the state of a virtual bucket is materialized within the least-significant bits of the conventional next field of a DWB, which is a node of a linked list. This ensures correct manipulation in case of concurrent insertions and removals of virtual buckets within the DWQ.

The reshuffle phase, which ensures that the number of events is balanced across the physical buckets, is not explicitly discussed since we simply resort to the algorithm provided by the underlying CRCQ. Consequently, whenever such an algorithm has to be triggered, we simply flush all the items stored within DWQ into the CRCQ by exploiting the migration algorithm discussed in Section 3.2.2.

3.2.1. DWQ insertion. The insertion logic (shown in Algorithm 2) resembles the insertion within a classical calendar queue. In fact, first we identify the entry of the DWSTR array that maintains the virtual bucket (DWB) that we want to update by inserting a new item, say the event e. Then, we perform a list traversal in order to detect the correct position of the DWB that will contain e upon insertion completion. If such a DWB does not exist, we simply create and connect a new one prefilled with e. Conversely, if it exists, we first check whether its current state is compliant with a new insertion, namely whether its state is INS. In this case, we proceed by inserting the item within the DWV array associated with the target DWB. This is done with just two atomic instructions: one Fetch&Add to get a free index within DWV and one Compare&Swap to insert the item within the acquired entry. If the latter succeeds, the operation is completed and we can return. Conversely, if it fails or the Fetch&Add returns an index outside the array boundaries, we simply proceed with the classical insertion
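The two-instruction DWV insertion described in Section 3.2.1, together with the idea of keeping the bucket state in the least-significant bits of the `next` field, can be sketched in C11 as follows. This is an illustrative sketch under assumptions rather than the authors' implementation: the numeric encoding of the states, `DWV_SIZE`, and the names `dwb_state` and `dwb_try_insert` are hypothetical, nodes are assumed to be at least 4-byte aligned so that two pointer bits are free, and the list traversal and DWB creation steps are omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DWV_SIZE 4                      /* assumed DWV capacity */

/* Assumed encoding of the bucket state in the two low bits of `next`. */
enum { ST_INS = 0, ST_VAL = 1, ST_FLS = 2, ST_MVD = 3 };
#define STATE_MASK ((uintptr_t)3)

typedef struct event { double timestamp; } event;

typedef struct dwb {
    atomic_uintptr_t next;              /* successor pointer | state bits */
    atomic_uint      index;             /* next free DWV slot */
    _Atomic(event *) dwv[DWV_SIZE];     /* unsorted deferred events */
} dwb;

/* Extract the state tag from the low bits of the next field. */
static unsigned dwb_state(dwb *b)
{
    return (unsigned)(atomic_load(&b->next) & STATE_MASK);
}

/*
 * Try to defer event e into bucket b.
 * Returns 1 on success and 0 when the caller has to fall back to a
 * direct insertion into the back-end calendar queue.
 */
static int dwb_try_insert(dwb *b, event *e)
{
    assert(b != NULL && e != NULL);
    if (dwb_state(b) != ST_INS)
        return 0;                                   /* bucket closed */
    unsigned slot = atomic_fetch_add(&b->index, 1); /* Fetch&Add: reserve a slot */
    if (slot >= DWV_SIZE)
        return 0;                                   /* array full (or sealed) */
    event *expected = NULL;
    /* Compare&Swap: fails if a migrating thread has blocked this slot. */
    return atomic_compare_exchange_strong(&b->dwv[slot], &expected, e);
}
```

A wrapper in the spirit of ENQUEUE in Algorithm 1 would call the back-end insertion whenever `dwb_try_insert` returns 0, so a failed deferral never loses the event.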


Algorithm 2 DWQ Insertion algorithm

procedure INSERT(event e)
    vb_idx ← computeIndex(e)
    bucket ← getDWB(vb_idx)
    state ← getState(bucket)
    if state ≠ INS then
        return FAIL
    cell_idx ← Fetch&Add(bucket.index)
    cell_ptr ← &bucket.DWV[cell_idx]
    if cell_idx ≥ bucket.size then
        return FAIL
    if ¬Compare&Swap(cell_ptr, NULL, e) then
        return FAIL
    return OK

Algorithm 3 DWQ Migration algorithm

procedure MIGRATE(DWB b)
    while (state ← getState(b)) ≠ MVD do
        if state = INS then
            Compare&Swap(&b.state, INS, VAL)
        if state = VAL ∧ b.pending_items = 0 then
            tmp ← Fetch&Add(&b.index, MAX_DWB_SIZE)
            tmp ← tmp % MAX_DWB_SIZE
            b.pending_items ← tmp
        if state = VAL ∧ b.pending_items ≠ 0 then
            for i ← 0 to b.pending_items do
                if b.DWV[i] = NULL then
                    Compare&Swap(&b.DWV[i], NULL, BLOCK)
            Compare&Swap(&b.state, VAL, FLS)
        if state = FLS then
            for i ← 0 to b.pending_items do
                evt ← b.DWV[i]
                if evt ≠ BLOCK ∧ evt.replica = NULL then
                    tmp ← clone(evt)
                    CRCQ.insert_invalid(tmp)
                    Compare&Swap(&evt.replica, NULL, tmp)
                validate(evt.replica)
            Compare&Swap(&b.state, FLS, MVD)

into the underlying calendar queue (see Algorithm 1). As a last note, the same fall-back path is executed whenever the target DWB is in a state that does not accept new items (any state different from INS).

3.2.2. DWQ migration. The migration protocol (depicted in Algorithm 3) is the core of our DWQ because it hides all the complexity of the deferred management of insertions. In fact, we need to provide a non-blocking approach for this task, allowing multiple threads to collaborate while migrating the same virtual bucket. The migration has three main phases, each one corresponding to one of the three transitions of the state machine depicted in Figure 2.

The first step consists in ensuring that the to-be-migrated bucket becomes stable in terms of enqueued items. In particular, we need to ensure that there exists a moment in time after which no new insertion in the target DWB can succeed. To reach this goal, we set the current state of the bucket to VAL via Compare&Swap, indicating that a validation is in progress. This ensures that upcoming insertions cannot succeed in adding a new item into the DWB. However, some concurrent thread might still try to insert an event. Consequently, we simulate that the DWV array is full by acquiring all the cells with an individual Fetch&Add. Then, we publish the old value N (returned by the Fetch&Add instruction) with an atomic exchange. At this point, we have globally signalled that no items can be inserted in entries of the array whose index is greater than N. Since concurrent threads might not have completed their insertions, also in this case we cannot safely migrate items from the DWB to the calendar queue. In particular, we need to perform a scan of the first N entries of the DWV array and block, with a Compare&Swap instruction, each cell that is still empty. This guarantees that a thread that has acquired an index cannot complete its insertion into the array. After this "validation" phase, we are sure that the number of items actually inserted in the to-be-migrated virtual bucket cannot change over time. Consequently, we apply another state transition by setting FLS as the current state of the DWB via Compare&Swap, starting the second phase of the migration algorithm, called flush.

This phase consists of migrating items from the DWB to the physical buckets of the underlying CRCQ. To reach this goal, we applied the same copy-and-validate strategy implemented within the original CRCQ during a calendar resize. It consists of creating an invalid replica of the to-be-migrated event and then inserting such a copy within the target physical bucket. Clearly, since multiple copies of the same event can be inserted into the target bucket by multiple threads, we need to choose which one has to be considered as valid and hence be used for extraction; skipping this step might lead to multiple extractions of the same event, violating the priority queue semantics. This is achieved by having each thread try to promote its copy as the master one by making the original event (the one within DWQ) point to its replica with a Compare&Swap. Such a complex algorithm guarantees non-blocking (wait-free) migrations, which could not be guaranteed if an item were migrated by a unique thread: this would make the whole DWQ data structure blocking or, even worse, would lead to non-ordered extractions. This step is repeated until each item has been migrated into the calendar queue. Note that the adopted batch insertion allows us to perform a single traversal of a physical bucket within the underlying CRCQ, providing amortized cost.

We signal that all the items have been migrated by setting the state of the virtual bucket to MVD. Even though the corresponding DWB can no longer be used for insertions and migrations, it is kept connected to the DWQ until it belongs to the past, namely until the highest priority event is held by a virtual bucket corresponding to a subsequent interval in the timestamp-based priority domain. This allows us to avoid corner cases, such as the continuous dematerialization and materialization of DWBs, whose management might hamper the responsiveness of our data structure.

3.2.3. LFDWCQ optimizations. There are three main aspects of LFDWCQ that have a relevant impact on performance since they might change the locality of memory
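The validation phase of the migration protocol in Section 3.2.2 (close the bucket, make the DWV array look full, block the cells that are still empty, then move to FLS) can be sketched in C11 as follows. This is a simplified sketch under assumptions, not the authors' Algorithm 3: the state is kept in its own atomic field, the published count is clamped to the array size and published with a Compare&Swap from a hypothetical `PENDING_UNSET` sentinel instead of the modulo and exchange used in the paper, and the flush phase (copy-and-validate into CRCQ) is left out. All identifiers are illustrative.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

#define DWV_SIZE 4                       /* assumed DWV capacity */
#define PENDING_UNSET UINT_MAX           /* assumed "not yet published" marker */
enum { ST_INS, ST_VAL, ST_FLS, ST_MVD };

typedef struct event { double timestamp; } event;

static event blocked_marker;             /* sentinel stored in sealed empty cells */
#define BLOCKED (&blocked_marker)

typedef struct dwb {
    atomic_uint      state;
    atomic_uint      index;              /* next free DWV slot */
    atomic_uint      pending;            /* number of cells to scan, once published */
    _Atomic(event *) dwv[DWV_SIZE];
} dwb;

static void dwb_init(dwb *b)
{
    atomic_init(&b->state, (unsigned)ST_INS);
    atomic_init(&b->index, 0u);
    atomic_init(&b->pending, PENDING_UNSET);
    for (size_t i = 0; i < DWV_SIZE; i++)
        atomic_init(&b->dwv[i], NULL);
}

/* Run (or help running) the validation phase: INS -> VAL -> FLS. */
static void dwb_validate(dwb *b)
{
    assert(b != NULL);
    unsigned ins = ST_INS;
    atomic_compare_exchange_strong(&b->state, &ins, (unsigned)ST_VAL);
    if (atomic_load(&b->state) != (unsigned)ST_VAL)
        return;                          /* validation already completed by others */

    /* Make the array look full so that later Fetch&Add calls overflow. */
    unsigned old = atomic_fetch_add(&b->index, DWV_SIZE);
    unsigned n = old < DWV_SIZE ? old : DWV_SIZE;
    unsigned unset = PENDING_UNSET;
    atomic_compare_exchange_strong(&b->pending, &unset, n);  /* first publisher wins */
    n = atomic_load(&b->pending);

    /* Block every reserved-but-empty cell: late writers will fail their CAS. */
    for (unsigned i = 0; i < n; i++) {
        event *empty = NULL;
        atomic_compare_exchange_strong(&b->dwv[i], &empty, BLOCKED);
    }
    unsigned val = ST_VAL;
    atomic_compare_exchange_strong(&b->state, &val, (unsigned)ST_FLS);
}
```

Every step is an idempotent Compare&Swap or a Fetch&Add whose overshoot is harmless, so several threads may execute `dwb_validate` on the same bucket concurrently without waiting for each other; a helper that publishes a larger count only causes extra empty cells to be blocked.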


[Figure 3 diagram: an array of bucket heads (H) starting at the current bucket and grouped into hot, warm and cold buckets, along a bucket-hotness axis running from hot to cold.]
Figure 3. Visual representation of the bucket hotness, namely the likelihood that it becomes active for extractions along wall-clock time.

accesses and the effect of executing read-modify-write instructions. The first one is the average number of items that belong to a virtual bucket. This clearly has an effect on the utilization of DWV arrays within the DWBs of our DWQ. In fact, when a DWV is full, threads insert items directly into the underlying calendar queue. Consequently, the denser the virtual buckets, the more threads can rely on DWQ to reduce the insertion latency. Moreover, since deferred enqueues are processed in a batch, we can tolerate longer lists (physical buckets) than both blocking and non-blocking calendar queues. In more detail, we used the same approach for bucket sizing adopted in CRCQ, which makes the number of events per bucket proportional to the average number of concurrent threads accessing the data structure.

The second key point for optimizing LFDWCQ consists in controlling the timeline of items' migrations from DWQ to CRCQ. Such an operation is carried out by extractions. On the one hand, since threads have to handle deferred work, the contention upon extractions, and hence the impact of conflicts, is reduced. On the other hand, migrating items from DWQ to CRCQ is an operation characterized by a heavy-weight usage of RMW instructions, which might hamper performance due to their impact on caches. These effects can be alleviated if the updated cache lines are unlikely to be shared. To this aim, we make extractions proactively migrate items by flushing those in a bucket subsequent to the current one (the virtual bucket currently targeted by extractions). The idea is to ensure that when a virtual bucket becomes hot, namely when it becomes the new target for extractions, it is already filled with all its (already migrated) items. This prevents multiple threads from trying to migrate the same events, sharing cache lines and increasing the pressure on the cache-coherency firmware.

Virtual buckets that immediately follow the hot one in the priority domain will be active for extractions soon. Hence, we can define the hotness of a bucket B as the distance in the timestamp-based priority domain between B and the hottest one, namely the bucket currently used for extractions. A visualization of this concept is provided in Figure 3. As suggested before, migrating an already hot bucket increases the likelihood of conflicts. On the other hand, migrating cold buckets, those far away from the hottest one, might reduce the utilization of the DWQ and make most of the insertions be performed directly on the underlying calendar queue, losing control over locality. Consequently, our proactive migration targets warm buckets, whose distance from the hottest one is large enough to avoid conflicts but not so large as to reduce DWQ utilization. Since the highest priority advances quickly as threads extract events, the beginning and the end of the warm region are strictly related to the actual level of concurrency on the data structure. Based on this consideration, we consider the first 2N buckets after the one currently used for extractions as either hot or warm, where N is the number of active threads. Since the hot region shifts by one bucket at a time, buckets near the hottest one have a higher likelihood of being proactively migrated and the utilization of cold buckets is not hampered.

Whenever a thread detects a conflict during a proactive migration phase, we make just one thread proceed while the other can fall back by executing its extraction from the current bucket. Such a back-off scheme can be easily implemented by exploiting the result of an individual Compare&Swap performed by a thread. In particular, if a swap performed by a thread A fails, it means that another thread B is working on the same bucket. Thus, we avoid any additional conflict by making thread A stop running the migration protocol and proceed with a classical extraction from the CRCQ. To further reduce the probability of conflicts upon migration phases, buckets of both calendars are assigned to NUMA nodes in a circular fashion and we make threads proactively migrate only buckets assigned to the NUMA nodes on which they are running. The benefits introduced by this simple scheme are two-fold. On the one hand, we reduce the set of threads that might compete for migrating a given bucket, hence the likelihood of conflicts. On the other hand, memory requests issued during migrations do not propagate towards far NUMA nodes, reducing latency for accessing and updating memory.

In order to make such a binding effective, we need to ensure that the memory buffers used for a virtual bucket and its events are effectively allocated on the target NUMA node. Ensuring this requires at most small adjustments to the underlying memory allocator. In fact, if threads are pinned to run on a specific core, no countermeasures have to be taken at all because the OS (e.g. the Linux kernel) typically allocates memory frames on the NUMA node nearest to the core issuing the request and/or first accessing the just allocated page. Conversely, if thread execution might migrate across different cores and NUMA nodes, a NUMA-aware allocator is required to have full control of memory buffer placement.

4. Experimental evaluation

We have compared the behavior of LFDWCQ with the ones of recent implementations of non-blocking calendar queues, namely the Non-blocking Calendar Queue (NBCQ) presented in [20] and the Conflict-Resilient Calendar Queue (CRCQ) [21]. All the data structures use the Epoch-Based
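The 2N-bucket window and the circular assignment of buckets to NUMA nodes discussed in Section 3.2.3 can be sketched with a few helper functions in C. This is a sketch under assumptions: the text does not specify how a bucket is chosen inside the window, so the scan below, which returns the nearest bucket owned by the caller's node, is only one possible policy; the modulo mapping is one way to realize a circular assignment; and the names `owner_node`, `in_hot_or_warm_window` and `pick_migration_candidate` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Circular (round-robin) assignment of virtual buckets to NUMA nodes. */
static unsigned owner_node(uint64_t vb, unsigned num_nodes)
{
    assert(num_nodes > 0);
    return (unsigned)(vb % num_nodes);
}

/* True if vb is among the first 2N buckets after the current one,
 * where N is the number of active threads. */
static int in_hot_or_warm_window(uint64_t vb, uint64_t current, unsigned n_threads)
{
    return vb > current && vb - current <= 2ull * n_threads;
}

/*
 * Assumed selection policy: scan the window and return the first bucket
 * owned by the caller's NUMA node, or -1 if the window holds none.
 */
static int64_t pick_migration_candidate(uint64_t current, unsigned n_threads,
                                        unsigned my_node, unsigned num_nodes)
{
    for (uint64_t vb = current + 1;
         in_hot_or_warm_window(vb, current, n_threads); vb++) {
        if (owner_node(vb, num_nodes) == my_node)
            return (int64_t)vb;
    }
    return -1;
}
```

With 8 NUMA nodes and 4 active threads, a thread on node 3 that sees bucket 10 as current would pick bucket 11, while a thread on node 2 would pick bucket 18; with only 2 active threads the window (buckets 11 to 14) contains no bucket of node 2, so that thread would skip the proactive step.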


[Figure 4 plots: throughput (Mops, 0 to 5) versus #Threads (0 to 32) for LFDWCQ, NBCQ and CRCQ, in four panels: (a) 400K events, (b) 1M events, (c) 2M events, (d) 4M events.]

Figure 4. Average Throughput for different queue sizes and thread counts.

Garbage Collector described in [26]. The benchmark used is the well-known Classic HOLD [27], where the queue is initially pre-populated to reach a target size and then stressed by having multiple threads perform a hold operation, namely an extraction immediately followed by an insertion. The timestamp increment of the newly inserted event (compared to the last extracted one) is obtained via an exponential distribution with mean equal to 1. Our aim was to evaluate the performance of the data structure at steady state, so we ran it for 10 seconds after the pre-population phase had completed. The performance metric we used is the average throughput computed over 10 different executions.

All the experiments have been carried out on an HP ProLiant server equipped with 4 AMD Opteron 6128 processors running at 2 GHz. Each processor has 8 cores for a total of 32 hardware threads. The machine has 64GB of RAM arranged in 8 NUMA nodes and runs Debian 9.2 (version 5.4.0 of the Linux kernel) as Operating System. All the code of the tested solutions is written in C and compiled with gcc 9.2.1 with the highest optimization flag (O3).

Figure 4 shows the average throughput (and its standard deviation) while running the benchmark with queue size ranging from 4·10^5 to 4·10^6 and different thread counts (from 1 to 32). It clearly shows that our approach pays off and has an improved behavior across all evaluated scenarios. To make this concept clear, consider the trend of NBCQ and CRCQ while increasing both queue size and concurrency level. On the one hand, if the thread count is lower than 16, NBCQ has higher throughput than CRCQ. On the other hand, when the concurrency level increases, the roles are reversed. This is because the conflict resiliency provided by CRCQ is traded off with latency (a trend well known in the literature [17]), penalizing performance at lower concurrency/contention.

Thanks to its NUMA-awareness, LFDWCQ alleviates the CRCQ inefficiencies with lower thread counts by improving the locality of memory accesses. In fact, the throughput of LFDWCQ always lies between those of CRCQ and NBCQ when the number of threads is lower than 16, maintaining a distance from the optimum bounded to 15% and reducing the performance loss of CRCQ compared to NBCQ by at least 50% and up to 100%. This is extremely relevant since the actual concurrency level of applications can vary. Consequently, improving the behavior of non-blocking calendar queues in a wider range of scenarios relieves the developer from the burden of choosing the implementation that best fits her/his use case. When the number of active threads exceeds 16, LFDWCQ provides up to 15% performance improvement w.r.t. the optimum in 3 out of 4 cases, showing that our approach towards NUMA-awareness is effective also at high levels of concurrency/contention. In particular, when the queue size is smaller than 4 million, the trend is still rising, suggesting that, if we could increase the number of cores, the gap between CRCQ and LFDWCQ would likely increase. Consequently, improved scalability has emerged as a secondary benefit of our approach. However, when the queue size is set to 4 million, we achieve the same performance as the original approach, showing no gain at full concurrency. This is because the benefits of batch insertions are reduced when the density of the events per virtual bucket increases too much.

This behavior is more evident when observing the latency of both insertion and extraction routines shown in Figures 5 and 6, respectively. The cost per insertion (enqueue) decreases when the event density increases because we have more opportunities to exploit DWQ for the insertion of future events, leading to up to a 50% improvement at full concurrency and largest queue size. On the other hand, the batch migration of items performed by extraction (dequeue) invocations alleviates the impact of high contention with smaller queue sizes (up to 33% improvement w.r.t. pure CRCQ). However, the latency becomes larger than the one provided by CRCQ when the queue size increases, offsetting the gains achieved by the enqueues. This suggests that our approach can be further extended by removing the batch migration from the critical path of dequeue executions, e.g. by resorting to helper threads whose only role is migrating items from DWQ to the underlying calendar queue. This is a direction we will explore as future work.

Finally, we analyzed the impact on caches of the different calendar queue implementations by monitoring the miss ratio of LLC accesses. This has been computed by resorting to Hardware Performance Counter statistics gathered with LIKWID [28], a well-known suite to access such low-level monitors; the hardware events to be sampled have been chosen according to the formula for LLC miss-ratio given in [29]. Figure 7 shows that our approach reduces the miss ratio by 50% independently of the queue size. This shows that relying on work deferring to insert events in batch is an


[Figure 5 plots: enqueue latency (us) versus #Threads (0 to 32) for LFDWCQ, NBCQ, and CRCQ. Panels: (a) 400K events, (b) 1M events, (c) 2M events, (d) 4M events.]

Figure 5. Average enqueue latency.

[Figure 6 plots: dequeue latency (us) versus #Threads (0 to 32) for LFDWCQ, NBCQ, and CRCQ. Panels: (a) 400K events, (b) 1M events, (c) 2M events, (d) 4M events.]

Figure 6. Average dequeue latency.

[Figure 7 plots: LLC miss ratio (%) versus #Threads (0 to 32) for LFDWCQ, NBCQ, and CRCQ. Panels: (a) 400K events, (b) 1M events, (c) 2M events, (d) 4M events.]

Figure 7. LLC miss ratio.
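
As a rough illustration of the metric plotted in Figure 7, the sketch below derives an LLC miss ratio from two raw hardware-counter readings. The function name `llc_miss_ratio`, its parameters, and the sample counts are hypothetical, and the plain misses-over-accesses definition is an assumption: the paper selects its events according to the formula in [29], which is not reproduced here.

```python
def llc_miss_ratio(llc_misses: int, llc_accesses: int) -> float:
    """Return the LLC miss ratio as a percentage.

    Assumed definition: misses / accesses. The events actually sampled
    in the paper follow the formula in [29], which may differ.
    """
    if llc_misses < 0 or llc_accesses < 0:
        raise ValueError("counter readings must be non-negative")
    if llc_misses > llc_accesses:
        raise ValueError("misses cannot exceed accesses")
    if llc_accesses == 0:
        return 0.0  # nothing sampled: report 0% instead of dividing by zero
    return 100.0 * llc_misses / llc_accesses


# Hypothetical counter readings, not measurements from the paper.
baseline = llc_miss_ratio(llc_misses=800, llc_accesses=1000)
deferred = llc_miss_ratio(llc_misses=400, llc_accesses=1000)

# Relative reduction of the miss ratio between the two hypothetical runs.
relative_reduction = 100.0 * (baseline - deferred) / baseline
```

With these made-up readings the second run halves the miss ratio of the first, which is the kind of relative reduction the text reports for Figure 7.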

effective technique to improve locality.

5. Conclusions

The management of event pools plays a central role in many applications, including simulation. The recent hardware trend towards multi/many-core platforms has therefore generated great interest in event-pool management algorithms capable of providing scalability in the face of concurrent accesses. However, far less work has been done to take into account another factor characterizing modern parallel machines, namely Non-Uniform Memory Access (NUMA). In this article, we have presented the Lock-Free Deferred-Work Calendar Queue, an event pool based on the Calendar-Queue archetype, which jointly offers scalable thread coordination (via non-blocking solutions) and NUMA-awareness. The latter feature has been achieved via an approach that changes the actual locality of operations by threads in the different memory banks in the NUMA architecture, an objective that has been reached by incorporating work deferring concepts into our solution. Performance tests with a classical benchmark, executed on top of an off-the-shelf medium-end machine equipped with 32 physical cores and 8 NUMA nodes (globally entailing 64GB of RAM), have shown how our solution can provide up to a 15% performance boost compared to state-of-the-art event-pool management algorithms already suited for multi-core machines.

References

[1] D. W. Wall, “Limits of instruction-level parallelism,” in Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS IV. New York, NY, USA: ACM, 1991, pp. 176–188. [Online]. Available: http://doi.acm.org/10.1145/106972.106991

[2] W. A. Wulf and S. A. McKee, “Hitting the memory wall: Implications of the obvious,” SIGARCH Comput. Archit. News, vol. 23, no. 1, pp. 20–24, Mar. 1995. [Online]. Available: http://doi.acm.org/10.1145/216585.216588

[3] H. Esmaeilzadeh, E. Blem, R. St. Amant, K. Sankaralingam, and D. Burger, “Dark silicon and the end of multicore scaling,” in Proceedings of the 38th Annual International Symposium on Computer Architecture, ser. ISCA ’11. New York, NY, USA: Association for Computing Machinery, 2011, pp. 365–376. [Online]. Available: https://doi.org/10.1145/2000064.2000108

[4] M. Herlihy and N. Shavit, “On the nature of progress,” in Proceedings of the 15th International Conference on Principles of Distributed Systems, ser. OPODIS ’11. Berlin, Heidelberg: Springer-Verlag, 2011, pp. 313–328. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-25873-2_22

[5] M. Ianni, R. Marotta, D. Cingolani, A. Pellegrini, and F. Quaglia, “The ultimate share-everything PDES system,” in Proceedings of the 2018 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, Rome, Italy, May 23-25, 2018, F. Quaglia, A. Pellegrini, and G. K. Theodoropoulos, Eds. ACM, 2018, pp. 73–84. [Online]. Available: https://doi.org/10.1145/3200921.3200931

[6] M. Dashti, A. Fedorova, J. Funston, F. Gaud, R. Lachaize, B. Lepers, V. Quema, and M. Roth, “Traffic management: A holistic approach to memory placement on NUMA systems,” in Proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’13. New York, NY, USA: Association for Computing Machinery, 2013, pp. 381–394. [Online]. Available: https://doi.org/10.1145/2451116.2451157

[7] L. Tang, J. Mars, X. Zhang, R. Hagmann, R. Hundt, and E. Tune, “Optimizing Google’s warehouse scale computers: The NUMA experience,” in Proceedings of the 2013 IEEE 19th International Symposium on High Performance Computer Architecture (HPCA), ser. HPCA ’13. USA: IEEE Computer Society, 2013, pp. 188–197. [Online]. Available: https://doi.org/10.1109/HPCA.2013.6522318

[8] B. Lepers, V. Quema, and A. Fedorova, “Thread and memory placement on NUMA systems: Asymmetry matters,” in 2015 USENIX Annual Technical Conference (USENIX ATC 15). Santa Clara, CA: USENIX Association, Jul. 2015, pp. 277–289. [Online]. Available: https://www.usenix.org/conference/atc15/technical-session/presentation/lepers

[9] A. Pellegrini and F. Quaglia, “NUMA time warp,” in Proceedings of the 3rd ACM Conference on SIGSIM-Principles of Advanced Discrete Simulation, London, United Kingdom, June 10-12, 2015, S. J. E. Taylor, N. Mustafee, and Y. Son, Eds. ACM, 2015, pp. 59–70. [Online]. Available: https://doi.org/10.1145/2769458.2769479

[10] I. D. Gennaro, A. Pellegrini, and F. Quaglia, “OS-based NUMA optimization: Tackling the case of truly multi-thread applications with non-partitioned virtual page accesses,” in IEEE/ACM 16th International Symposium on Cluster, Cloud and Grid Computing, CCGrid 2016, Cartagena, Colombia, May 16-19, 2016. IEEE Computer Society, 2016, pp. 291–300. [Online]. Available: https://doi.org/10.1109/CCGrid.2016.91

[11] F. Strati, C. Giannoula, D. Siakavaras, G. Goumas, and N. Koziris, “An adaptive concurrent priority queue for NUMA architectures,” in Proceedings of the 16th ACM International Conference on Computing Frontiers, ser. CF ’19. New York, NY, USA: Association for Computing Machinery, 2019, pp. 135–144. [Online]. Available: https://doi.org/10.1145/3310273.3323164

[12] S. Roghanchi, J. Eriksson, and N. Basu, “Ffwd: Delegation is (much) faster than you think,” in Proceedings of the 26th Symposium on Operating Systems Principles, ser. SOSP ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 342–358. [Online]. Available: https://doi.org/10.1145/3132747.3132771

[13] I. Calciu, S. Sen, M. Balakrishnan, and M. K. Aguilera, “Black-box concurrent data structures for NUMA architectures,” in Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, ser. ASPLOS ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 207–221. [Online]. Available: https://doi.org/10.1145/3037697.3037721

[14] R. Brown, “Calendar queues: A fast 0(1) priority queue implementation for the simulation event set problem,” Commun. ACM, vol. 31, no. 10, pp. 1220–1227, Oct. 1988. [Online]. Available: https://doi.org/10.1145/63039.63045

[15] W. T. Tang, R. S. M. Goh, and I. L.-J. Thng, “Ladder queue: An O(1) priority queue structure for large-scale discrete event simulation,” ACM Trans. Model. Comput. Simul., vol. 15, no. 3, pp. 175–204, Jul. 2005. [Online]. Available: https://doi.org/10.1145/1103323.1103324

[16] F. Quaglia, “A low-overhead constant-time lowest-timestamp-first CPU scheduler for high-performance optimistic simulation platforms,” Simulation Modelling Practice and Theory, vol. 53, pp. 103–122, 2015. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1569190X15000209

[17] J. Lindén and B. Jonsson, “A skiplist-based concurrent priority queue with minimal memory contention,” in Principles of Distributed Systems, R. Baldoni, N. Nisse, and M. van Steen, Eds. Cham: Springer International Publishing, 2013, pp. 206–220.

[18] S. Gupta and P. A. Wilsey, “Lock-free pending event set management in time warp,” in Proceedings of the 2nd ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, ser. SIGSIM PADS ’14. New York, NY, USA: Association for Computing Machinery, 2014, pp. 15–26. [Online]. Available: https://doi.org/10.1145/2601381.2601393

[19] R. Marotta, M. Ianni, A. Pellegrini, and F. Quaglia, “A non-blocking priority queue for the pending event set,” in Proceedings of the 9th EAI International Conference on Simulation Tools and Techniques, ser. SIMUTOOLS ’16. Brussels, BEL: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2016, pp. 46–55.

[20] R. Marotta, M. Ianni, A. Pellegrini, and F. Quaglia, “A lock-free O(1) event pool and its application to share-everything PDES platforms,” in Proceedings of the 20th International Symposium on Distributed Simulation and Real-Time Applications, ser. DS-RT ’16. IEEE Press, 2016, pp. 53–60. [Online]. Available: https://doi.org/10.1109/DS-RT.2016.33

[21] R. Marotta, M. Ianni, A. Pellegrini, and F. Quaglia, “A conflict-resilient lock-free calendar queue for scalable share-everything PDES platforms,” in Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, ser. SIGSIM-PADS ’17. New York, NY, USA: Association for Computing Machinery, 2017, pp. 15–26. [Online]. Available: https://doi.org/10.1145/3064911.3064926

[22] D. Alistarh, J. Kopinsky, J. Li, and N. Shavit, “The spraylist: A scalable relaxed priority queue,” SIGPLAN Not., vol. 50, no. 8, pp. 11–20, Jan. 2015. [Online]. Available: https://doi.org/10.1145/2858788.2688523

[23] D. Dechev, P. Pirkelbauer, and B. Stroustrup, “Lock-free dynamically resizable arrays,” in Proceedings of the 10th International Conference on Principles of Distributed Systems, ser. OPODIS ’06. Berlin, Heidelberg: Springer-Verlag, 2006, pp. 142–156. [Online]. Available: http://dx.doi.org/10.1007/11945529_11

[24] T. L. Harris, “A pragmatic implementation of non-blocking linked-lists,” in Proceedings of the 15th International Conference on Distributed Computing, ser. DISC ’01. London, UK: Springer-Verlag, 2001, pp. 300–314. [Online]. Available: http://dl.acm.org/citation.cfm?id=645958.676105

[25] M. P. Herlihy and J. M. Wing, “Linearizability: A correctness condition for concurrent objects,” ACM Trans. Program. Lang. Syst., vol. 12, no. 3, pp. 463–492, Jul. 1990. [Online]. Available: http://doi.acm.org/10.1145/78969.78972

[26] K. Fraser, “Practical lock-freedom,” Ph.D. dissertation, University of Cambridge, 2004.

[27] R. Rönngren and R. Ayani, “A comparative study of parallel and sequential priority queue algorithms,” ACM Trans. Model. Comput. Simul., vol. 7, no. 2, pp. 157–209, Apr. 1997. [Online]. Available: https://doi.org/10.1145/249204.249205

[28] J. Treibig, G. Hager, and G. Wellein, “LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments,” in Proceedings of the 2010 39th International Conference on Parallel Processing Workshops, ser. ICPPW ’10. USA: IEEE Computer Society, 2010, pp. 207–216. [Online]. Available: https://doi.org/10.1109/ICPPW.2010.38

[29] P. J. Drongowski and B. D. Center, “Basic performance measurements for AMD Athlon 64, AMD Opteron and AMD Phenom processors,” AMD whitepaper, vol. 25, 2008.

Agent-based Modeling and Simulation for Emergency Scenarios: A Holistic Approach

Andrea Piccione, Sapienza, University of Rome, piccione@diag.uniroma1.it
Alessandro Pellegrini, Lockless S.r.l., pellegrini@lockless.it

Abstract: Agent-based Modeling and Simulation is a powerful technique that makes it possible to study the interactions in complex systems and to explore, or even foresee, the emergence of more complicated properties or behaviors arising from the interaction among the simpler agents in the environment. In the context of emergency or crisis scenarios, Agent-based Modeling and Simulation makes it possible to study emergency plans effectively, with the goal of assessing their viability, also with respect to the number of possible fatalities. In this paper, we analyze Agent-based Modeling and Simulation for crisis scenarios from a methodological and empirical point of view, with the goal of identifying the behavioral parameters that a model should encompass in order for the results of the simulation to be useful for emergency plan assessment and/or compilation. We also provide an experimental characterization of the effects of such behavioral parameters.

Keywords: Agent-Based Modeling and Simulation, Emergency Simulation, Planning.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Agent-Based Modeling and Simulation (ABMS) is a powerful paradigm in which the system is represented by a collection of autonomous decision-making entities (the agents) which are set out in an environment [1], [2]. Each agent individually assesses the surrounding environment, also taking into account the presence of other agents, and makes decisions on the basis of a certain set of rules which implement its behavior. During its lifetime, an agent can decide to change its behavior, also depending on the environment state and interactions with other agents. The actions that agents take might also have effects on other agents and/or on the surrounding environment; for example, an agent can produce, consume, or exchange items.

ABMS is considered incredibly powerful for multiple applications and real-world business problems for a number of reasons. First of all, the model developer can concentrate on the design of the agents' behavior independently of where the agents will act. This significantly simplifies the development of complex models, making it possible to reach results which could be difficult to obtain when relying on more traditional mathematical methods [3], [4]. Second, the interaction of multiple agents in a system can exhibit complex behavioral patterns [5], and can also show (or even anticipate) what is commonly referred to as emergent behavior. Emergence occurs when an entity is observed to have properties its parts do not have on their own. These properties or behaviors emerge only when the parts interact in a wider whole. Several approaches have also coupled sophisticated models with neural networks [6], evolutionary algorithms [7], or other learning techniques in order to provide the agents with behavioral adaptation, making ABMS even more powerful and realistic.

ABMS can be regarded as an effective methodology to address the problem of studying the behavior of crowds when emergency situations arise. This is particularly important for large-scale events, which are prone to natural disasters and chaos generated by people, which could pose a severe threat to crowds. Among the possible events which should be subject to careful analysis we can enumerate religious services, sporting events, cultural shows, public demonstrations, and marches of any sort and kind. In many countries, the organization of these events must be accompanied by the compilation of ad-hoc security and evacuation plans, to reduce the risk of accidents and fatalities. When these plans are compiled, it is fundamental to identify solutions which allow the crowd to escape from catastrophic events in the shortest possible amount of time and/or minimize the number of people injured or killed; in many real-world scenarios, simply following the shortest path to an exit might not deliver optimal results. Plans should also consider the possibility that some security exits are blocked, or that the direction to be followed should change during the escape. This could be the case, for example, of cascading catastrophic events, such as the collapse of part of a building due to a fire.

In many scenarios, compiling these plans is difficult. Indeed, only real-world experience based on real accidents (which involve real people) could provide the required information to compile the plans. Of course, this is not viable: real-world experience or experiments with real people can be too costly, dangerous, or might be simply impossible, as in the case of the compilation of evacuation plans for buildings or architectonic ensembles which are not yet built.

In the case of a catastrophic event, a fundamental aspect to be taken into account (especially in large environments) is that people do not immediately become aware of the risk or of the occurrence of the event itself. In these circumstances, the panic generated by the event could be worsened by detrimental behavior due to people observing escaping crowds without knowing the reason for it. For this reason, the law in several countries requires escape plans to explicitly consider the presence of police (or other law/security


enforcement agencies) which should monitor the emergency situation, inform the people by technological means such as loudspeakers, and/or guide the crowd towards the best-suited security exit. Disregarding the possibility that the crowd ignores the information provided by security agents (we will deal with this possibility in Section III) could be a source of ineffectiveness of the plan itself. Additionally, when compiling (or enacting) a security plan, a fundamental question is: “how many security agents should be used to minimize the number of fatalities, and what is their best-suited position in the environment?”

In this paper, we explore ABMS as a technique to support the compilation of these security plans, explicitly accounting for different behavioral aspects which should be considered when designing the logic behind single agents, so as to capture the emergent behavior of crowds in a highly realistic way. ABMS has features (autonomy, reactivity, pro-activity, and social interaction of the agents) which make this method a natural choice for scenarios requiring autonomous and adaptive participating agents [8]. Nevertheless, particular care must be put in the design of such models. Indeed, one way of modeling such scenarios is to focus on global flow considerations [9], or on local interactions only [10]. Structurally, an egress scenario can be studied taking into account all the reachable exits, while distributing the population evenly (in terms of egress time), as is typically done in flow control [11], [12]. Nevertheless, at an individual level, agents are not particles, but social entities [13].

We define several building blocks of the agents which we consider fundamental to execute significant ABMS simulations of evacuation scenarios. We believe that such an analysis could be helpful for people studying the behavior of crowds, and for practitioners who are involved in the development of assistive tools for the compilation of security plans. In particular, we consider the modeling methodology presented here as effective for evacuation simulations in the context of earthquakes, landslides, floods, fires, terrorist attacks, crazy drivers, shootings, collapses, bombings, panic caused by misbehaving people, or abandoned objects which could be thought to be bombs, just to mention a few. Depending on the specific scenario, fewer aspects of the holistic modeling approach which we propose can be considered, as the behavior of the agents is fully probabilistic.

We complete our exploration with an experimental characterization of the effects of the different behavioral aspects and parameters on the final results of the simulations. With this study, we stress the need for a holistic approach in ABMS for evacuation scenarios.

The remainder of this paper is structured as follows. In Section II we discuss related work. Our modeling methodology is presented in Section III. The experimental characterization is reported in Section IV.

II. RELATED WORK

A lot of work has been done on ABMS, especially in the context of frameworks and runtime environments to support their execution on large-scale clusters. We refer the reader to the comprehensive work in [14] for a thorough discussion on the technical aspects related to the deployment of agent-based models.

As a methodology to study crowds in the context of evacuations, ABMS has been proven to be an effective way to model and analyze the movements and the behavior of very dense crowds. This approach has been applied to many diverse scenarios, such as malls, airports, or parks. Abdelghany et al. [15] have presented a simulation-optimization modeling framework to study the evacuation of large-scale pedestrian facilities with multiple exit gates. In their work, they couple genetic algorithms and ABMS to generate optimal evacuation plans for hypothetical crowded exhibition halls. The authors assume that the involved people receive evacuation instructions, which is an important aspect, but they nevertheless do not take into account the possibility that security exits become unavailable while the crowd is evacuating the building. Moreover, they assume that the people will follow the provided instructions accurately and unequivocally, which is a strong assumption for real-world emergency scenarios.

Wang and Wainer [16] have presented a distributed framework for modeling evacuation of crowds which models the environment in a realistic way starting from CAD/BIM authoring tools. This work illustrates the importance of relying on realistic environments for real-world models. We consider the environment to be a fundamental aspect in the modeling methodology, and we discuss how general environments should be modeled, although we do not retain the capability of using authoring tools out of the box.

Zheng et al. [17] have evaluated different methodologies to carry out crowd evacuation simulations. The evaluated methodologies include cellular automata models, lattice gas models, social force models, fluid-dynamic models, agent-based models, game theoretic models, and approaches based on experiments with animals. The authors conclude that psychological and physiological elements affecting individual and collective behaviors should also be incorporated into the evacuation models, the assessment of which is exactly part of the characterization which we carry out in this paper.

The importance of aspects such as physiological, emotional, and social group attributes has been studied in [18]. This work shows that when social group and crowd-related behaviors are modeled according to findings and theories from social psychology, and when the interactions among individuals are realized by means of agent-based execution processes, it becomes easier to simulate a person's awareness of the situation and the consequent changes in the internal attributes, and the results are realistic at both individual and group level.

Du et al. [19] have shown that evacuation plans could be significantly suboptimal if the involved people are significantly older than average. In their work, they have shown that older people are often not taken into account with great care, even when compiling evacuation plans for senior apartment buildings. Older people typically behave differently in emergency situations, as they move slower and might ask for help [20], and have a higher fall


probability [21]. Puts et al. [22] have shown that by 2050 the world population with an age greater than 60 years will be composed of 22 billion people, and Prot and Clements [23] have shown that older people are more subject to accidents than other people. All in all, from this body of work, it is quite clear that it is not possible to avoid considering age in ABMS of evacuation plans, as the presence of elderly people might also lead to unexpected emergent behavior of the crowd.

Chu et al. [24] have shown that egress simulations produce significantly different results when taking into account different agent behavioral models, namely following familiar exits, following cues from building features, navigating with social groups, and following crowds. Similarly, Zia and Ferscha [25] have shown that it is fundamental to combine individual, social, and technological models of people during evacuation, in order to obtain results which are close to real-world scenarios. These are aspects which we explicitly retain, while we combine them with additional behavioral characteristics.

Overall, we consider all the aforementioned aspects in this paper (and additional ones), we try to orchestrate the concepts in a holistic way with respect to the modeling strategy, and we provide an experimental characterization of the effects of these behavioral parameters on the overall simulation results.

III. THE MODELING APPROACH

The modeling approach which we propose and study in this paper can be regarded as a tool for analysis, study, and forecasting of the behavior of crowds in closed or open space environments, with a special focus on evacuation in case of crisis scenarios. The approach is based on ABMS, and we define and combine the characteristics of each behavioral aspect which we consider fundamental for a significant simulation that is also able to produce realistic emergent behavior. The ultimate goal of this modeling approach is to allow for a what-if analysis of the evacuation plans of buildings and/or public events.

A. Representation of the Environment and Management of Correlated/Timed Events

A fundamental aspect for effective ABMS of crowd egress scenarios is to provide a high degree of parameterization and behavioral capabilities at the level of the agents and the environment. As far as the environment is concerned, it is fundamental to specify an accurate representation of the obstacles that the agents moving around could find on their way. We consider traditional grid-based representations to be only partially suited for the purpose. In particular, the work in [26] has shown the importance in ABMS of relying on a graph-based topology to represent more complex environments. In our modeling approach, we envisage the reliance on more traditional grid-based environments to represent portions of the overall space, which are then linked in a graph-like fashion from/to specific points of the grids. This solution makes it easy to represent multi-level buildings, or areas which can be reached only from specific entrance points, and provides a good degree of flexibility in the configuration of the environment.

Moreover, it is fundamental to be able to specify the initial condition for the crowd distribution, and possible source points of other people entering the environment. In our approach, the steady state of the crowd distribution can be reached thanks to mobility models and/or by specifying the initial distribution of the crowd in the environment. There has been extensive research on this aspect in the literature, and we refer the reader to the work in [27] for a discussion and a possible methodology with respect to this specific aspect.

As mentioned in Section I, we target several different emergency scenarios in our modeling approach. At the same time we advocate that, for a reliable assistive tool for the compilation of evacuation plans, it is important to take into account the integration of multiple catastrophic events. Therefore, an ABMS model must provide the possibility to consider that, during a single simulation, multiple events occur at different time instants. It is also fundamental to correlate such events. Therefore, the modeling approach should consider that, given the occurrence of some event in the environment, correlated events could take place after a certain amount of time, either in a fixed way, or by creating relations which are based on probability distributions. This is the case, e.g., of parts of the building collapsing some time after an explosion has taken place. Another example is that of combined terrorist attacks, which take place shortly one after the other, also while the crowd is already escaping. Often, it is extremely hard to make an analysis of such events when compiling an evacuation plan, given the high number and stochasticity of variables to account for, thus making ABMS a fundamental assistive methodology.

Another aspect to account for is the timely intervention of rescuers or police. This is an aspect that also depends on the environment. As an example, a catastrophic event happening at a concert might be more difficult to manage for rescuers, as the high density of the crowd could prevent rescue vehicles from reaching the critical points quickly. Also, the mixture of people and vehicles in the same environment could create more security risks, or increase the level of panic in the people attending the event. Additionally, the social behavior of the people is such that they could seek rescuers even if they do not actually need assistance, thus slowing down the intervention, or creating variations in the evacuation flow as soon as rescuers reach the incident location.

As already highlighted, the way in which evacuation starts can play a fundamental role in the evacuation process. In large environments, different people could be informed in different ways of the occurrence of an event that requires them to egress. People near the accident will likely notice the event by themselves, while people farther away might be notified by loudspeakers, they could observe part of the crowd running away, or they could be notified “remotely” by some kind of gossip dissemination; social networks or messaging applications could also play a role here. This kind of remote interaction could also be misinterpreted, driving part of the crowd towards the critical place(s) in the environment, rather than in the opposite direction. This is a kind of emergent behavior which could lead to the adoption of different notification systems in the environment, or which could drive


the selection of the best-suited position of law enforcement officers in the environment, e.g., during some event.

B. Behavioral Characteristics of Crowds

All the aspects discussed so far affect the evacuation of the crowd differently, depending on the characteristics of each person involved in the evacuation. We advocate that some fundamental aspects must be considered for an evacuation simulation to be reliable, and we stress that these aspects cannot be studied separately from each other. In the following, we describe the aspects which must be taken into account when describing the behavior of an agent in a simulation model.

a) Emotionality and Emotional Contamination: This is a fundamental aspect to take into account when describing the behavior of individuals during emergency situations. Anxiety, panic attacks, fear, and bewilderment are all aspects of the personality of an individual which could lead to “erroneous” or dangerous actions, both for the single individual and for the community during an evacuation. Emotional attitude should be described and considered, and it must also be combined with environmental aspects which can change the actions that an individual is performing during the evacuation. We model emotionality as a numerical value which is increased taking into account the number $n$ of people nearby (the concept of crowdedness), the distance $d_c$ from the catastrophic event, and the observability of the exit point, along with its estimated distance $d_e$; if the exit is not observable, we set $d_e = \infty$. Each individual is characterized by an emotional factor $\eta \in [0, 1]$ which drives the speed at which the emotionality value is updated towards the critical threshold. Overall, emotionality, which is always in the range $[0, 1]$, is updated according to Equation (1), which accounts for a very high emotionality ramp-up after the occurrence of the critical event:

$$E' = \eta \, \frac{1}{e} \left(1 + \frac{d_c}{n \cdot d_e}\right)^{\frac{n \cdot d_e}{d_c}} + (1 - \eta)\,E, \qquad (1)$$

where $e$ is a control variable which is set to $\infty$ until the occurrence of the catastrophic event, and to 1 afterwards; it prevents the emotionality value from increasing in a normal environment.

Every time the emotional value $E$ of an individual exceeds a certain threshold $\bar{E}$, the agent starts to misbehave. Misbehavior entails forgetting about its heading towards an exit and starting to move according to a random walk, possibly also seeking rescuers if they are nearby. This misbehavior continues until the emotional value drops below the threshold, e.g., thanks to the agent getting closer to an exit.

Emotional contamination is also taken into account in this computation: two or more individuals who are in proximity could “contaminate each other” with respect to their behavior. As an example, if multiple anxious people are gathered together, without any “leader” or “stronger” individual in proximity, they might generate collective panic crises which are detrimental to security and safety.

b) Age: As already mentioned, age is an increasingly important aspect to take into account when compiling security plans. The age of an individual could alter the way in which they move and orient themselves in the surrounding environment. In particular, the speed at which an individual moves in the environment is inversely proportional to their age. A fall probability is also defined, which grows exponentially with age.

c) Grouping: The study of the emergent behavior of the crowd must also take into account that multiple individuals might know each other beforehand, and that they are placed in the environment as a group; a simple example is a family, or a group of friends. It is likely that such groups will exhibit a “pack behavior”, in which the interactions with the environment and the movements happen as a group. These groups will try to maximize the probability that the group stays together, and this could potentially affect the emergent behavior. Different environments can be characterized by a different probability of grouping, and this should be explicitly taken into account when compiling an evacuation plan. In our approach, the grouping probability $P_g$ is the probability that an agent is grouped with a nearby agent.

d) Remote Grouping: The wide spread of social networks and the ubiquitous presence of communication means add the need to account for a different kind of grouping. In particular, if a group of people entered the environment together, but later split for any reason, it is likely that, if an emergency scenario arises, they will try to regroup or get in contact before leaving the environment. This could clearly create delays in the evacuation, or counter-intuitive behaviors (e.g., moving towards the accident point). Again, this is an aspect which must be taken into account to deliver reliable simulations. In our modeling approach, the remote grouping probability $P_{rg}$ determines whether the agent, after it becomes aware of the emergency situation, will either stop for a random amount of time or start to move towards the other individuals forming its group. The two behaviors are chosen uniformly at random.

e) Memory and Knowledge: Different individuals might have a different knowledge of the surrounding environment, and their memory could play a fundamental role. A motivating example is a person who enters a mall for the first time. They do not know where exit locations could be found, but they do remember the route they traveled to reach a certain position. During the evacuation, also depending on their emotional state, they might decide to travel towards the entrance which led them into the building, possibly ignoring other factors such as the presence of more suitable exits, or information provided by security officers. Similarly, the lack of knowledge of the surrounding environment might make the decision-making process slower, or it could possibly exacerbate the “herd effect”, in which people simply follow other people during the escape. The memory probability $P_m$ and the environmental knowledge probability $P_{kn}$ determine how the agents will behave once they become aware of the occurrence of a critical event.
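The emotionality update described in paragraph a) can be illustrated with a short Python sketch. It assumes that Equation (1) reads E' = η · (1/e) · (1 + d_c/(n·d_e))^(n·d_e/d_c) + (1 − η) · E, which is one plausible reading of the formula. The function names, the explicit handling of the limit cases (for example d_e = ∞), and the final clamping to [0, 1] are assumptions made for illustration and are not taken from the authors' implementation.

```python
import math


def growth_term(n: int, d_c: float, d_e: float) -> float:
    """Return (1 + d_c/(n*d_e)) ** (n*d_e/d_c), handling limit cases explicitly.

    With x = n*d_e/d_c the term is (1 + 1/x) ** x, which tends to 1 as
    x -> 0+ and to Euler's number as x -> infinity.
    """
    if n <= 0 or d_e <= 0 or math.isinf(d_c):
        return 1.0       # nobody nearby, exit reached, or no event: x -> 0+
    if d_c <= 0 or math.isinf(d_e):
        return math.e    # on top of the event, or exit not observable: x -> inf
    x = n * d_e / d_c
    return math.exp(x * math.log1p(1.0 / x))


def update_emotionality(E: float, eta: float, n: int,
                        d_c: float, d_e: float, e_ctrl: float) -> float:
    """One update step under the assumed reading of Equation (1).

    e_ctrl is the control variable: infinity before the catastrophic
    event (the driving term vanishes), 1 afterwards.
    """
    drive = 0.0 if math.isinf(e_ctrl) else growth_term(n, d_c, d_e) / e_ctrl
    new_E = eta * drive + (1.0 - eta) * E
    # Assumption: clamp so that emotionality stays within [0, 1].
    return min(1.0, max(0.0, new_E))


def misbehaves(E: float, threshold: float = 0.9) -> bool:
    """An agent misbehaves while its emotionality exceeds the threshold."""
    return E > threshold


# Before the event the value only decays; after it, it ramps up quickly.
calm = update_emotionality(0.5, 0.2, n=10, d_c=5.0, d_e=math.inf, e_ctrl=math.inf)
alarmed = update_emotionality(0.5, 0.2, n=10, d_c=5.0, d_e=math.inf, e_ctrl=1.0)
```

Under this reading, the control variable switches the driving term off before the event, so the value can only decay, which matches the statement that emotionality does not increase in a normal environment.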


f) Knowledge of Environmental Risks: This is another fundamental behavioral aspect, especially in cascading catastrophic events. A motivating example is an individual who, during a fire, moves in proximity of pillars or poles. These architectural elements could be easily damaged by the fire, and their collapse might produce additional fatalities. In this sense, an individual who has a higher knowledge of these risks might leave the shortest path to a safety exit, just to avoid several environmental risks. If the environment is extremely crowded, this behavior could create blockages or a slowdown, which are extremely important factors from an emergent behavior point of view. The probability $P_{er}$ determines whether an agent is aware of environmental risks. If so, the agent will try to avoid all the regions in the environment which it considers risky to traverse.

g) Trustfulness in Other People and Institutions: This behavioral aspect describes whether an individual will likely trust other people in the escape (therefore, “joining them” and possibly forming a group), or whether they will abide by the indications of security officials and/or rescuers. It is possible that this aptitude will negatively influence the choices taken, even when useful information to leave the risky environment is available. The probability $P_t$ tells how probable it is that an agent will trust (and implement) the evacuation plans suggested by surrounding agents, rather than continuing to evacuate according to its own strategy.

h) Social Networks: They are increasingly important in daily life. In emergency situations, too, it is possible that people will spend some time seeking information; this aspect is also related to the aforementioned grouping aspect. The timeliness and quality of the retrieved information might be questionable as well. In particular, inaccurate information might lead people to make wrong decisions, even in proximity of secure points such as emergency exits. At the same time, this phenomenon could generate additional delays in the evacuation of some people. The probability $P_{sn}$ determines whether an agent, upon the occurrence of the critical event, will spend a random amount of time stopped, consulting social networks.

i) Lack of Understanding or Confusion: During an evacuation, individuals might not fully understand the information that is provided to them, also by rescuers, or they might enter a confusional state; this is also related to the aforementioned emotional aspect. These states might make the individuals forget or disregard important information about the correct path to an exit which they had already acquired. Some people in a confusional state might take a wrong path in the evacuation, or they could simply stop, impeding the egress of other individuals. The confusion probability $P_c$ determines whether an agent gets into the confused state. This condition can be checked multiple times during the simulation.

j) Chaos-generating Individuals: With specific respect to terrorist attacks, it is possible to suppose the presence of people who generate chaos on purpose. These people will explicitly act against the evacuation of crowds. This kind of byzantine behavior should be explicitly modeled in complex scenarios, especially because it could make an otherwise good plan completely ineffective.

From this classification of the aspects which we consider fundamental in ABMS for evacuation purposes, it is clear that some of them partially overlap, in the causes and/or in the effects. This is exactly the reason why we advocate that a holistic approach towards ABMS in these scenarios should consider all of them at once. In particular, we consider that, for each individual, the behavioral description should be based on probability distributions, which feed different explanatory variables describing the agents. In this sense, an evacuation plan should be considered reliable if and only if it respects some Key Performance Indicator levels under a high variability of the agents’ behavioral characterization. Of course, in the context of specific public events (e.g., concerts or religious services) some configurations might be excluded. As an example, the distribution of the age of individuals can be tailored to the kind of public event. Nevertheless, an assistive ABMS-based tool could greatly simplify the compilation of a security plan, if the model is able to account for all the aspects which we have discussed at once.

C. Behavioral Characteristics of Rescuers

With respect to rescuers, we consider several different aspects and entities. One general aspect is related to the timeliness of the intervention. In particular, we consider the possibility that actual rescuers or law enforcement agents require some time to start acting after the catastrophic event. Indeed, this modeling approach can also account for the fact that, before intervening, the individuals in charge need to coordinate. The configuration of this general aspect works at the level of the single individual, because different people might react with a different timeliness. The timeliness of intervention is therefore a configuration parameter which is drawn from a Gaussian distribution. The mean value of this distribution can be specified at simulation configuration, accounting also for the aforementioned delay for coordination.

Another aspect is related to the possibility, the delay, and the period of repetition according to which all individuals in the environment are notified of the fact that an emergency is occurring. This aspect mimics the fact that, as we have already discussed, rescuers could inform all people of an accident by means of loudspeakers. Whether the crowd takes this information into account or not depends on the individuals’ state. The modeling of this aspect is similar in spirit to that of timed events which we have discussed before.

Rescuers can be classified into people and vehicles. In the modeling approach, vehicles are directed towards a specific place in the environment and they move at a speed which is inversely proportional to the crowdedness of the region. Vehicles can be of any kind, as they could represent firefighters reaching the spot of a fire, or policemen trying to reach the area of a shooting. Their target position is updated during the simulation, every time their target changes location (think, again, of shooters). We account for a delay in the notification of the change of their target, which represents coordination and/or communication time.
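The two quantitative rules stated for rescuers (a per-individual intervention delay drawn from a Gaussian distribution, and a vehicle speed inversely proportional to crowdedness) can be sketched in Python as follows. This is only an illustrative sketch: the function names, the standard-deviation parameter, the truncation of negative delays to zero, and the treatment of crowdedness values below 1 are assumptions, not details given in the text.

```python
import random


def sample_intervention_delay(rng: random.Random, mean: float, std: float) -> float:
    """Draw the time one rescuer needs before starting to act.

    `mean` is the configured mean of the Gaussian (it is meant to include
    the coordination delay); `std` is a hypothetical spread parameter.
    Negative draws are truncated to 0 (assumption).
    """
    return max(0.0, rng.gauss(mean, std))


def vehicle_speed(free_speed: float, crowdedness: float) -> float:
    """Speed inversely proportional to the crowdedness of the region.

    Assumption: crowdedness below 1 does not speed the vehicle up beyond
    its free-flow speed.
    """
    return free_speed / max(1.0, crowdedness)


# Each rescuer gets its own delay, so different individuals react at different times.
rng = random.Random(42)
delays = [sample_intervention_delay(rng, mean=30.0, std=5.0) for _ in range(1000)]
```

Sampling one delay per rescuer, rather than one per simulation, mirrors the statement that the configuration works at the level of the single individual.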


With respect to people, rescuers have a twofold goal in our modeling approach. On the one hand, they instruct the people about the best-suited strategy to leave the environment. In our modeling approach, rescuers can be regarded as oracles which, at any time, give the best information. This clearly mimics the fact that during evacuations there is often a coordination system which tells the rescuers what the proper actions are. Again, whether this information is used or not by individuals evacuating the environment depends on their current state. On the other hand, with respect to emotional contamination, the presence of rescuers in proximity of individuals will reduce their anxiety level, therefore reducing the possibility that they misbehave.

D. Key Performance Indicators

An aspect which still requires discussion is how to analyze the output of an evacuation simulation. In particular, ABMS could provide the end user with a bulk of data so large that it might become impossible to draw conclusions on the scenario of the simulation. While visualization tools might be useful to interpret graphically the outcome of a simulation (we refer the reader again to the work in [16]), there are some numerical Key Performance Indicators (KPIs) which could be computed during or at the end of the simulation, to further interpret the “goodness” of the simulated scenario. The fundamental KPIs which we envisage for catastrophic events (similar KPIs have been identified in the work in [28]) are the following:

1) Number of individuals evacuated per unit of time: a “good” evacuation plan is one that maximizes the number of evacuees;
2) Time to evacuate all people (except for fatalities): a “good” evacuation plan is one that minimizes this time;
3) Usage of safest paths to evacuate the people: this is especially true in the context of cascading catastrophic events;
4) Minimization of the cost to actuate the evacuation plan: this can be regarded as a multi-objective optimization problem, in which we want to minimize the number of fatalities, while reducing the number of security officers involved in the process.

Of course, the stochastic nature of this kind of simulation requires a large number of different runs, for each configuration, to have reliable results.

IV. EXPERIMENTAL ANALYSIS

We have implemented an agent-based simulation model taking into account all the aspects described in Section III and generating all the KPIs for each simulation run¹. The model has been implemented on top of the open source ROOT-Sim Speculative PDES runtime environment [29], and all the simulations have been run on a hexa-core Intel i7-9750H CPU, equipped with 16 GB of RAM, running Linux 5.4.0.

¹ Our implementation is available at https://github.com/HPDCS/egress.

Fig. 1: Reference Emergency Scenario for the Experimental Assessment.

In our reference implementation, the different kinds of agents which we have described before belong to 5 different main classes. The first class is called rescuers, and implements the behavior described in Section III-C. We have then categorized agents as group leaders and group members. This classification allows us to generically represent the aforementioned aspects associated with grouping, remote grouping, and social networks. We have then devised simple individuals, which are agents who do not belong to any group. The last class is that of agitators, which encompasses agents which are confused, have a lack of understanding, or are generating chaos. All the other aspects discussed in Section III-B are captured in terms of explanatory variables, characterizing every single agent.

The environment in which the simulation takes place is depicted in Figure 1. The dark grey lines are obstacles (i.e., walls). An agent survives the simulated scenario if it successfully reaches one of the green regions (the exits). At startup, the agents are uniformly distributed in the small gray section. A total of 10,000 agents has been used in our simulation scenario. Three consecutive explosions take place in the simulated scenario. Before the first explosion, the agents are free to spread in the environment. At simulation time T = 10, the first explosion takes place in the upper right part of the map. At simulation time T = 100, a second explosion takes place, right in the agents’ starting area (the gray area). A third explosion takes place at T = 200 in the left part of the map. The terrain is partitioned into hexagonal cells only for the purpose of partitioning the simulation run into multiple simulation objects. One hexagonal cell has the long diagonal set to 300 meters, thus making the size of the environment non-minimal. The scenario is purposefully disastrous in order to highlight the emerging trend, especially when different simulation parameters are used.

The baseline configuration for the model is as follows. We set the average emotionality threshold to $\bar{E} = 0.9$, the average age to 40 years, the grouping probability to $P_g = 0.1$, the remote grouping probability to $P_{rg} = 0.005$, the average knowledge probability to $P_{kn} = 0.1$, the probability to rely on memory to $P_m = 0.01$, the probability of knowledge of environmental risks to $P_{er} = 0$ (no actual environmental risks are modeled in the scenario), the trustfulness probability


to $P_t = 0.1$, the probability to rely on social networks to $P_{sn} = 0$, and the probability of confusion to $P_c = 0.01$. An individual is chaos-generating with probability $P_{ch} = 0.01$. Such an individual will randomly walk against the escaping crowd, while also actively trying to scare and confuse the other agents.

We modify this baseline configuration to study what happens to a subset of the KPIs which we have introduced when one single parameter is changed; for the sake of brevity we are not able to report results related to all KPIs in this paper. We have run complete simulation scenarios, meaning that the simulation is halted only when all agents have either evacuated the map or died. Each point in the plots is averaged over 5 different simulation runs, for a total of 55 runs for each plot. All the different configurations of the model have been run with the same set of 5 random seeds for the random-number generators, to allow for a more stable comparison. We present results associated with the variation of the percentage of fatalities over the total number of individuals in the simulations, the number of individuals evacuated per unit of time, and the time to evacuate all people. These results are studied when varying different parameters, namely age (Figure 2), the confusion probability (Figure 3), the grouping probability (Figure 4), the knowledge of the environment (Figure 5), trustfulness (Figure 6), and the presence of chaos-generating individuals (Figure 7).

Experimental data show that there is a positive correlation between the fatality rate and the individuals’ average age (Figure 2). This is expected, since a higher age reduces mobility. However, the mortality does not increase dramatically. The standard deviation of the age distribution has been set to 15 years in order to reduce the noise over multiple simulation runs. This means that rescuers and grouped agents do not often lose sight of each other due to a vast difference in mobility. This effect is also witnessed in the egress per time unit and the time to evacuate. The former decreases as the average age increases, while the latter increases linearly.

Grouping (Figure 4) has a negligible effect on the number of fatalities in our simulation scenario. As explained earlier, age variability is limited, therefore grouped agents do not often lose sight of each other. Moreover, in our simplified model, an agent that loses sight of its group simply tries to escape alone. This does not directly impact the survival chance. Confusion probability (Figure 3), on the other hand, is strongly correlated with the mortality rate. All conditions impacting the agents' mobility have a huge influence on the outcome of this simulated scenario. This phenomenon is also reflected in the egress per time unit, which drops almost to zero for higher probabilities, and is similarly observed in the time to evacuate, which grows exponentially. This is also an expected result, because if a large fraction of the agents are confused, they start misbehaving, moving farther from the exits.

The effect of knowledge of the environment is interesting (Figure 5), because it illustrates a predominant behavioral effect. A zero-knowledge probability generates a significantly high number of fatalities, and drastically increases the time to evacuate. This result is associated with the fact that the largest part of the agents has no idea about where to go to leave the disaster scenario, and starts wandering around. Rescuers, on the other hand, try to drive people towards the exits, but are in any case subject to the same behavior as the other agents: they tend to move farther from the explosion points. In this sense, the zero-knowledge agents are continuously subject to random movement, until they reach an exit by chance. Given the adversarial map, this could require a significant amount of time, and given the subsequent explosions it can be fatal for a large number of agents.

The most interesting (and possibly unexpected) result is associated with trustfulness (Figure 6). A slight increase in trustfulness generates an increase in the number of fatalities and in the time to evacuate. This is an emergent behavior related to contrasting information. In this scenario, there is a subset of the agents who have a non-minimal knowledge of the environment, and are already heading towards a known exit on the map. If these agents are also associated with a high trustfulness, while heading towards the exit, they might change their plan and start following indications from the rescuers, which entails heading towards a different exit. Given the adversarial map, the time to evacuate increases, and this also creates conglomerates of agents who slow down the stampede of others. When the trustfulness is increased to a higher extent, the egress becomes more organized, and the agents can reach exits in a more ordered way. Interestingly, trustfulness must be set to 100% to obtain a 50% reduction in fatalities.

Chaos-generating individuals are a small minority of the agent population. Also, they do not directly restrict the movement of the population. In this extreme simulated scenario, the effect of chaos-generating individuals is therefore insignificant: the mortality rate is already almost 40% with the default settings. Nevertheless, a minimal increase in the time to evacuate can be observed.

To conclude the experimental assessment, we report that, on average, the simulation of a complete scenario requires 55 seconds of wall-clock time.

V. CONCLUSIONS AND FUTURE WORK

In this paper we have discussed a holistic approach with respect to ABMS for emergency management, with a special focus on crowds escaping environments in the case of cascading catastrophic events. We have shown experimentally the effect of the different behavioral characteristics of individuals on the overall results of the simulation. Our results confirm that it is important to consider multiple aspects at once, because the outcome of the simulations could lead to very different results.

In our future work we plan to perform an in-vitro reconstruction of real-world accidents from the past. This effort will allow us to determine whether and to what extent our modeling approach is able to recreate actual evacuations in real-world accident situations. Moreover, we plan to integrate our model into a framework which will allow us to automate the exploration of different parameters given a configuration, so as to determine what could be the best-suited characteristics of a final evacuation plan.
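The one-parameter-at-a-time exploration used in Section IV, which the future work proposes to automate, can be sketched in Python as follows. The baseline values are those listed in Section IV; everything else (the `Config` field names, the seed values, the `kpis` and `sweep` helpers, and above all `stub_run`, a trivial placeholder that does not model any evacuation) is hypothetical and only shows how such a sweep could be organized around a real simulation entry point.

```python
from dataclasses import dataclass, replace
from statistics import mean


@dataclass(frozen=True)
class Config:
    """Baseline configuration from Section IV (field names are hypothetical)."""
    emotionality_threshold: float = 0.9
    average_age: float = 40.0
    p_group: float = 0.1
    p_remote_group: float = 0.005
    p_knowledge: float = 0.1
    p_memory: float = 0.01
    p_env_risk: float = 0.0
    p_trust: float = 0.1
    p_social: float = 0.0
    p_confusion: float = 0.01
    p_chaos: float = 0.01


SEEDS = (1, 2, 3, 4, 5)  # placeholder seeds; the paper does not list the actual ones


def kpis(n_agents: int, n_dead: int, end_time: float) -> dict:
    """Three of the KPIs reported in the plots, from the raw outcome of one run."""
    return {
        "fatalities_pct": 100.0 * n_dead / n_agents,
        "saved_per_time_unit": (n_agents - n_dead) / end_time,
        "time_to_evacuate": end_time,
    }


def sweep(run, baseline: Config, field: str, values, seeds=SEEDS) -> list:
    """Vary one field of the baseline; average each KPI over the same seeds."""
    rows = []
    for value in values:
        cfg = replace(baseline, **{field: value})
        results = [kpis(*run(cfg, seed)) for seed in seeds]
        averaged = {name: mean(r[name] for r in results) for name in results[0]}
        rows.append({field: value, **averaged})
    return rows


def stub_run(cfg: Config, seed: int):
    """Placeholder for a real simulation run; returns (agents, dead, end time).

    The numbers are arbitrary and exist only to make the sketch executable.
    """
    return 100, round(100 * cfg.p_confusion), 50.0 + seed


rows = sweep(stub_run, Config(), "p_confusion", [0.0, 0.5])
```

Reusing the same seeds for every configuration follows the paper's choice of a common seed set for comparability. As a purely illustrative count, eleven parameter values with five seeds each would give 55 runs for one plot.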


Fig. 2: Effects of Average Age. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Average Age (years), from 30 to 70.

Fig. 3: Effects of Confusion Probability. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Probability, from 0 to 0.9.

Fig. 4: Effects of Grouping Probability. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Probability, from 0 to 1.

Fig. 5: Effects of Knowledge of the Environment. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Probability, from 0 to 1.


Fig. 6: Effects of Trustfulness. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Probability, from 0 to 1.

Fig. 7: Effects of Chaos-Generating Individuals. Panels: (a) Fatalities (%), (b) Egress per time unit (saved per time unit), (c) Time to evacuate (logical time). Horizontal axis: Probability, from 0 to 1.

REFERENCES

[1] E. Bonabeau, “Agent-based modeling: Methods and techniques for simulating human systems,” Proceedings of the National Academy of Sciences, 2002.
[2] C. M. Macal and M. J. North, “Tutorial on Agent-Based Modeling and Simulation,” Proceedings of 2005 Winter Simulation Conference, 2005.
[3] J. Epstein and R. L. Axtell, Growing Artificial Societies: Social Science from the Bottom Up. MIT Press, 1997.
[4] A. M. Colman, “The complexity of cooperation: Agent-based models of competition and collaboration,” Complexity, 1998.
[5] C. W. Reynolds, “Flocks, herds and schools: A distributed behavioral model,” ACM SIGGRAPH Computer Graphics, vol. 21, no. 4, pp. 25–34, Aug. 1987.
[6] N. Gilbert and P. Terna, “How to build and use agent-based models in social science,” Mind & Society, vol. 1, no. 1, pp. 57–72, Mar. 2000.
[7] S. Mabu, K. Hirasawa, and J. Hu, “A Graph-based Evolutionary Algorithm: Genetic Network Programming (GNP) and its Extension Using Reinforcement Learning,” Evolutionary Computation, vol. 15, no. 3, pp. 369–98, Sep. 2007.
[8] R. Liu, D. Jiang, and L. Shi, “Agent-based simulation of alternative classroom evacuation scenarios,” Frontiers of Architectural Research, vol. 5, no. 1, pp. 111–125, Mar. 2016.
[9] T. J. Cova and J. P. Johnson, “A network flow model for lane-based evacuation routing,” Transportation Research Part A: Policy and Practice, vol. 37, no. 7, pp. 579–604, Aug. 2003.
[10] W. Yuan and K. H. Tan, “An evacuation model using cellular automata,” Physica A: Statistical Mechanics and its Applications, vol. 384, no. 2, pp. 549–566, Oct. 2007.
[11] D. Helbing, I. Farkas, and T. Vicsek, “Simulating dynamical features of escape panic,” Nature, vol. 407, no. 6803, pp. 487–490, Sep. 2000.
[12] D. Helbing, A. Johansson, and H. Z. Al-Abideen, “Dynamics of crowd disasters: An empirical study,” Physical Review E, vol. 75, no. 4, p. 046109, Apr. 2007.
[13] A. Schadschneider et al., “Evacuation Dynamics: Empirical Results, Modeling and Applications,” in Extreme Environmental Events, 2011.
[14] S. Abar, G. K. Theodoropoulos, P. Lemarinier, and G. M. O’Hare, “Agent Based Modelling and Simulation tools: A review of the state-of-art software,” Computer Science Review, vol. 24, pp. 13–33, May 2017.
[15] A. Abdelghany, K. Abdelghany, H. Mahmassani, and W. Alhalabi, “Modeling framework for optimal evacuation of large-scale crowded pedestrian facilities,” European Journal of Operational Research, 2014.
[16] S. Wang and G. Wainer, “A simulation as a service methodology with application for crowd modeling, simulation and visualization,” SIMULATION, 2015.
[17] X. Zheng, T. Zhong, and M. Liu, “Modeling crowd evacuation of a building based on seven methodological approaches,” Building and Environment, 2009.
[18] L. Luo et al., “Agent-based human behavior modeling for crowd simulation,” in Computer Animation and Virtual Worlds, 2008.
[19] X. Du, Y. Chen, A. Bouferguene, and M. Al-Hussein, “Multi-agent based simulation of elderly egress process and fall accident in senior apartment buildings,” in Proceedings - Winter Simulation Conference, 2019.
[20] J. Lord, B. Meacham, A. Moore, R. Fahy, and G. Proulx, “Guide for evaluating the predictive capabilities of computer egress models,” NIST GCR, 2005.
[21] F. Sharifi et al., “Predicting risk of the fall among aged adult residents of a nursing home,” Archives of Gerontology and Geriatrics, 2015.
[22] M. T. Puts et al., “Meeting the needs of the aging population: The canadian network on aging and cancer,” Current Oncology, 2017.
[23] E. Y. Prot and B. Clements, “Preparedness in Long-Term Care: A Novel Approach to Address Gaps in Evacuation Tracking,” 2017.
[24] M. L. Chu, P. Parigi, J. C. Latombe, and K. H. Law, “Simulating effects of signage, groups, and crowds on emergent evacuation patterns,” AI and Society, 2015.
[25] K. Zia and A. Ferscha, “An Agent-Based Model of Crowd Evacuation,” in Proceedings of the 2020 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, ser. PADS. New York, NY, USA: ACM, Jun. 2020, pp. 129–140.
[26] A. Piccione, M. Principe, A. Pellegrini, and F. Quaglia, “An Agent-Based Simulation API for Speculative PDES Runtime Environments,” in Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, ser. PADS. New York, New York, USA: ACM Press, 2019, pp. 83–94.
[27] W. Li, Z. Di, and J. M. Allbeck, “Crowd distribution and location preference,” in Computer Animation and Virtual Worlds, 2012.
[28] W. Zhang et al., “Agent-Based Modeling of a Stadium Evacuation in a Smart City,” in Proceedings of the 2018 Winter Simulation Conference, A. Rabe et al., Eds. IEEE Press, Dec. 2018, pp. 2803–2814.
[29] A. Pellegrini and F. Quaglia, “The ROme OpTimistic Simulator: A tutorial,” in Proceedings of the Euro-Par 2013: Parallel Processing Workshops, ser. PADABS, D. an Mey et al., Eds. LNCS, Springer-Verlag, 2014, pp. 501–512.


Collision Avoidance Proposal in a MEC based VANET environment

Nicolas Nevigato, Mauro Tropea, Floriano De Rango
Dimes Department, University of Calabria, Rende, Italy
Email: nicolas.nevigato@libero.it, mtropea@dimes.unical.it, derango@dimes.unical.it

Abstract: Mobile Edge Computing (MEC) is a new network paradigm that allows resource management and IT services at the edge of a communication network and, thus, closer to the devices, guaranteeing low latency and high bandwidth requirements. This characteristic makes the MEC paradigm suitable for critical communication services used in collaboration with container-based virtualization and with 5G networks. In this paper, an implementation of a collision avoidance system based on MEC in a VANET environment is proposed. This system makes use of cloud and edge computing and is able to switch communication from the edge to the cloud server and vice versa when possible, trying to guarantee the required constraints and balancing the communication among the servers to avoid overloading the edge layer. The simulation results show that, in some cases, the MEC-5G combination is the best solution for avoiding collisions in a VANET environment.

Index Terms: VANET, Collision Avoidance, Mobile Edge Computing (MEC), 5G

Fig. 1: Communication in vehicular networks

I. INTRODUCTION

In recent years, Vehicular Ad-Hoc Networks (VANETs) and all Intelligent Transportation System (ITS) technologies have increasingly been the object of researchers' studies, given their importance for improving safety and security on roads everywhere. The great progress in different aspects of the automotive industry is proof of the importance of this research [1]. Recently, the Internet of Vehicles (IoV) concept [2] has been introduced to identify the possible communication between cars and all other objects present in the vehicular scenario, such as RoadSide Units (RSUs), pedestrian devices, and edge or cloud servers [3]. This means that vehicles are aware of their surroundings thanks to the different sensors integrated in them. Different types of communication can be identified in such a scenario, as shown in Fig. 1: vehicles are able to exchange information with other vehicles (V2V communication), with road infrastructure (V2I communication), with pedestrians (V2P communication) and with communication networks (V2N communication). Together they provide the so-called Vehicle-to-Everything (V2X) communication, that is, communication from vehicle to everything. The vehicular network has different tasks such as road safety, traffic efficiency, energy saving, emission reduction, autonomous vehicles and so on [4]–[6]. Given the nature of the context, low-latency communication plays a key role in a VANET. The support of the 5G paradigm can therefore be of great help, since it is able to guarantee the best latency and reliability in highly mobile and densely connected scenarios. Moreover, the Mobile Edge Computing (MEC) network architecture [7], which moves processing, storage and network capabilities closer to end devices, is another important factor to be taken into account in vehicular networks. It can be implemented on RSUs and, by supporting V2I communication, it is able to meet the requirements of low latency and high bandwidth.

Many real applications use MEC technology in collaboration with the latest generation networks to exploit their benefits. A possible use in the automotive sector concerns driver safety, through assisted driving systems that improve the reliability, efficiency and quality of road transport, for example by predicting potential collisions through the early detection of a road hazard, see Fig. 2. The realization of such a system is greatly helped by virtualization with Docker containers and the container orchestrator Kubernetes [8].

The rest of this paper is organized as follows: Section II presents a brief overview of the vehicular environment; Section III describes the system considered in the work; Section IV describes the mechanisms proposed for avoiding collisions; numerical results are presented in Section V; and finally, Section VI concludes the paper.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE


Fig. 2: Potential collision event in the system

II. RELATED WORK

The research on ITS and VANET is full of papers dealing with different aspects typical of this context. Many of these works concern routing, with different types of approaches such as the new opportunistic one [9], [10] or genetic approaches in a multicast context [11], [12]. They concern VANET and Cloud computing, security issues, routing protocols, link prediction for routing purposes, virtualization and so on. Many research papers deal with the new paradigm called Internet of Vehicles (IoV), such as [13], where the authors present a survey on IoV highlighting architectural and application aspects.

Collision avoidance is addressed in many works. For example, in [14] a system based on the RoadSide Unit (RSU) is proposed, where RSUs collect data from the events that occurred, disseminate the data and make decisions for the vehicles; they are also able to help an ambulance avoid the path where the collision occurred and navigate along the next quickest route. In [15] a method for predicting future vehicle positions is proposed using a new framework based on a Convolutional Neural Network (CNN) called YOLOv3, an algorithm used for computer vision. The authors propose this algorithm to provide a collision detection/avoidance mechanism for IoV networks based on the knowledge of future vehicle positions.

Security aspects are addressed in many other papers. For example, in [16] the authors present an overview of different VANET security aspects, showing the main attacks and countermeasure techniques. They present a comparative analysis and indicate the future research direction of the community. In [17] the main aspect investigated is information safety, considered one of the most critical issues in VANET. The authors present confidentiality, privacy, and authenticity as the main challenges of a VANET system, and also discuss different classes of VANET attacks.

Virtualization and operation with the cloud/edge paradigm are the object of many studies, such as [18], where the authors propose a VANET system based on a Software-Defined Network (SDN) architecture supporting network virtualization, or [19], where the authors present the concept of vehicular cloud computing as a merging of the VANET and cloud computing paradigms. In [20] the authors provide a survey on vehicular edge computing (VEC), which uses the concepts of the VANET and MEC paradigms.

Other works deal with warning of dangerous situations in VANET and with how to resolve emergency situations with the help of on-board sensors, such as [21], where the attention is focused on the design of a new approach for vehicular environments that gathers information during mobile node trips in order to warn of dangerous or emergency situations by exploiting on-board sensors. It is assumed that each vehicle has an integrated on-board unit composed of several sensors and a GPS device, able to spread alert messages around the network regarding warnings and dangerous situations or conditions, and that each vehicle can communicate with roadside devices called RSUs, which are able to spread warning messages. In this way, if an accident occurs, the arriving cars will probably avoid delays and dangerous situations.

III. SYSTEM DESCRIPTION

Fig. 2 depicts the system implemented to simulate a scenario in which a collision can happen between two vehicles. In order to prevent potential collisions, a MEC paradigm is used. The MEC server is integrated in an RSU and plays a fundamental role, together with 5G technology, in avoiding collisions in an ITS system.

Fig. 3: Considered scenario

The simulated system has been realized using ordinary PCs that simulate the RSU units, running in Docker containers [22] managed by Kubernetes [23], while the vehicular traffic is generated with the SUMO simulator [24], see Fig. 3. Finally, the MEC server and, therefore, the communication from server to vehicle have been implemented. The final system is able to simulate a traffic of vehicles that communicate with each other and with the server in order to manage a collision avoidance mechanism, thereby guaranteeing safety in the vehicular environment. Vehicles periodically send, through a device installed on board, messages that contain information about their status to the RSU MEC server. In particular, each message contains the identification number (ID), the position and the speed of the vehicle in question.
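To make the content of the periodic status message concrete, the following is a minimal sketch of such a message and its serialization. The paper only states that the message carries the vehicle ID, position and speed; the JSON encoding, the field names and the two-coordinate position used here are assumptions made for illustration, not the format used by the authors.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class VehicleStatus:
    """Hypothetical status message: ID, position and speed of one vehicle."""
    vehicle_id: int
    x: float      # assumed planar position, in meters
    y: float
    speed: float  # assumed unit: m/s

    def to_payload(self) -> bytes:
        # Serialize the message as UTF-8 JSON (an assumed wire format).
        return json.dumps(asdict(self)).encode("utf-8")

    @staticmethod
    def from_payload(payload: bytes) -> "VehicleStatus":
        # Rebuild the message on the server side from the received bytes.
        return VehicleStatus(**json.loads(payload.decode("utf-8")))
```

A server receiving such a payload could decode it with `VehicleStatus.from_payload(...)` and compare consecutive speed values of the same vehicle.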


The messages exchanged between vehicles and RSUs are processed and used to understand when an emergency braking is taking place, and therefore to anticipate any collisions. In case of an emergency event, the MEC server generates an alert message and immediately sends it to the following vehicles within a certain range. When the alert message reaches the vehicles, the braking system is automatically activated to stop the vehicle safely and avoid a collision. A dynamic switching algorithm has also been implemented that allows the vehicle to instantly decide whether to send the message to the MEC or to the Cloud server, in order to avoid overloading the roadside devices when it is not necessary.

A. Docker and Kubernetes

The MEC server integrated in the RSU units is realized using a virtualization mechanism. A container is an isolated environment sharing the same kernel as the operating system. Docker is an open source project that automates the deployment of applications within software containers, providing additional abstraction thanks to virtualization at the operating system level of Linux [8], [22]. It uses the resource isolation features of the Linux kernel. Docker implements high-level APIs to manage containers that run processes in isolated environments. Since it uses Linux kernel functionality (mainly cgroups and namespaces), a Docker container, unlike a virtual machine, does not include a separate operating system. Instead, it uses kernel functionality and leverages resource isolation (CPU, memory, block I/O, network) and separate namespaces to isolate the application from the operating system. It relies on the concept of an image, which includes the essentials of the operating system and is built from a Dockerfile script.

Kubernetes is an open source container orchestration and management system [23]. It is based on different components, divided into master and nodes. The master is the main element and the other nodes refer to the master to coordinate themselves. A node, also called a worker, has the task of executing the workload according to the operating modes defined by the master. A group of workers is called a cluster. The resource describing the elementary unit executable on a cluster node is called a Pod. Kubernetes guarantees reliability, as it can automatically restart containers that fail during execution and terminate those that do not respond, always guaranteeing a certain number of running containers.

B. Vehicular traffic management with Sumo

The vehicular traffic is managed by the Sumo simulator. A stretch of the SS107 Silana-Crotonese road, in southern Italy, has been considered, and some xml files have been created containing all the information regarding the road portion: the IDs of the two lanes of the roadway with their direction, the maximum speed in m/s and so on; and the information on the travelling vehicles: acceleration and deceleration values, ID, length, maximum speed in m/s, minimum gap with the preceding vehicle, and so on. Sumo allows the road network to be imported via Open Street Map, simplifying the process of developing the mobility model. This real-world road network can be easily downloaded, modified and imported into the simulator through the netconvert command. In this way, realistic vehicular traffic in the urban environment is generated. The Eclipse development environment is used to manage the simulator using code written in Java and the library called SumoTraciConnection [24]. In order to start the simulation, an xml configuration file with the .config extension has been created containing: the file related to the road route, the file related to the vehicles and an additional file to display an alert on the map in case of a collision between two vehicles. Moreover, the configuration file contains a time section where the start and the end of the simulation, with the steps, are specified. It is also possible to generate an output file with the simulation logs.

To carry out the communication between vehicles and the MEC server and between vehicles and the Cloud server, the CoAP protocol [25] was chosen, using the Californium library written in Java [26]. Each vehicle was created as a CoAP client and periodically makes requests to the MEC or Cloud server, which acts as a CoAP server.

C. Critical parameters design

Preliminary studies have been conducted to analyze the latencies and reaction times necessary to avoid collisions among vehicles in the considered scenario.

The vehicle that is ahead has ID = 0, while the vehicle that follows has ID = 1, as shown in Fig. 2. For simplicity, in the mathematical formulation these IDs are written as superscripts. The two vehicles travel along a stretch of road at constant speed, have the same deceleration capacity $a$ and maintain a distance $d$ from each other. At a given instant, the vehicle with ID = 0 brakes sharply. Let $t_{cr}$ be the sum of the latency and the reaction time necessary for the vehicle with ID = 1 to start braking. The equation of motion, with starting time $t_0 = 0$, for the vehicle with ID = 0 is as follows:

$$x^0(t) = \begin{cases} x_0^0 + v_0^0 t - \frac{1}{2} a t^2, & t < \frac{v_0^0}{a} \\ x_0^0 + \frac{1}{2} \frac{(v_0^0)^2}{a}, & t \ge \frac{v_0^0}{a} \end{cases} \tag{1}$$

The equation of motion, with starting time $t_0 = 0$, for the vehicle with ID = 1 is as follows:

$$x^1(t) = \begin{cases} x_0^1 + v_0^1 t, & t < t_{cr} \\ x_0^1 + (v_0^1 + a t_{cr})\, t - \frac{1}{2} a \left(t^2 + t_{cr}^2\right), & t \ge t_{cr} \end{cases} \tag{2}$$

Suppose that, due to braking, the vehicle with ID = 0 has come to a stop. It is then possible to calculate the instant of collision between vehicle ID = 0 and the following vehicle ID = 1 by setting the positions reached by the two vehicles equal:

$$x_0^0 + \frac{1}{2} \frac{(v_0^0)^2}{a} = x_0^1 + (v_0^1 + a t_{cr})\, t - \frac{1}{2} a \left(t^2 + t_{cr}^2\right) \tag{3}$$

Rearranging, with $d = x_0^0 - x_0^1$, the following second-degree equation is obtained:

$$\frac{1}{2} a t^2 - (v_0^1 + a t_{cr})\, t + \frac{1}{2} a t_{cr}^2 + d + \frac{1}{2} \frac{(v_0^0)^2}{a} = 0 \tag{4}$$

that has real solutions for $\Delta > 0$:

$$\Delta = (v_0^1 + a t_{cr})^2 - 4 \cdot \frac{1}{2} a \left( \frac{1}{2} a t_{cr}^2 + d + \frac{1}{2} \frac{(v_0^0)^2}{a} \right) > 0 \implies t_{cr} > \frac{d}{v} \tag{5}$$

Indeed, with $v_0^0 = v_0^1 = v$ the discriminant reduces to $\Delta = 2a\,(v\, t_{cr} - d)$. Therefore, a collision between the two vehicles occurs when $t_{cr}$ is greater than the ratio between the distance $d$ and the speed $v$ of the considered vehicles.
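The collision condition can be checked numerically with a short sketch. The functions below follow the two equations of motion and the discriminant of the second-degree equation as given above; the function names, the deceleration value used in the usage note and the clamping of the follower position once it has stopped are assumptions added for illustration.

```python
def leader_position(t, x0, v, a):
    """Position of vehicle ID = 0, which brakes at t = 0 and then stays stopped."""
    t_stop = v / a
    if t < t_stop:
        return x0 + v * t - 0.5 * a * t * t
    return x0 + 0.5 * v * v / a


def follower_position(t, x0, v, a, t_cr):
    """Position of vehicle ID = 1, which starts braking only after t_cr."""
    if t < t_cr:
        return x0 + v * t
    # Assumption: once the follower has stopped, it stays where it is.
    t = min(t, t_cr + v / a)
    return x0 + (v + a * t_cr) * t - 0.5 * a * (t * t + t_cr * t_cr)


def discriminant(d, v, a, t_cr):
    """Discriminant of the second-degree equation, for equal initial speeds v."""
    return (v + a * t_cr) ** 2 - 4 * (0.5 * a) * (
        0.5 * a * t_cr ** 2 + d + 0.5 * v * v / a
    )


def collision_expected(d, v, t_cr):
    """Collision condition: latency plus reaction time exceeds d / v."""
    return t_cr > d / v
```

For example, at 50 km/h (about 13.9 m/s) and a distance of 6.5 m, the ratio d / v is roughly 0.47 s, so `collision_expected(6.5, 50 / 3.6, 0.45)` is false while `collision_expected(6.5, 50 / 3.6, 0.50)` is true. The sign of `discriminant` agrees for any positive deceleration, for instance an assumed a = 6 m/s².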

D. Communication between Vehicles and Edge/Cloud Server

The implemented system follows a client-server structure with CoAP communication. The communication between a vehicle and the MEC or Cloud server follows the CoAP paradigm: a CoAP client sends information to the server through the POST method of the Californium library [26]. The information inside a message concerns the vehicle status, such as ID, speed and position. The Sumo simulator connected to Eclipse is used to manage client-side communication, that is, it manages the vehicles travelling on the map. This communication can take place only with a MEC server, only with a Cloud server, or with dynamic switching between the MEC and Cloud servers.

Briefly, the behavior of the individual vehicle is shown in the flow chart of Fig. 4.

Fig. 4: Flow chart of the behavior of a vehicle

Fig. 5 shows a timing diagram of the interactions between the travelling vehicles and the servers. It can be seen that, depending on the technology used, the request and response times are different. In particular, the longer the communication latency, the greater the delay with which the vehicle will be notified by the server.
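The dynamic switching between the MEC and Cloud servers mentioned above can be sketched by reusing the collision condition of Section III-C: the Cloud is acceptable only if its latency plus the reaction time does not exceed the ratio d / v. This is a minimal sketch under that assumption; the function name, the string labels and the latency value in the usage note are hypothetical and not taken from the paper.

```python
def choose_server(distance_m, speed_mps, cloud_latency_s, reaction_time_s):
    """Pick the server a vehicle should contact for its next message.

    The Cloud is chosen whenever its latency still leaves enough time to
    brake; otherwise the lower-latency MEC (edge) server is required.
    """
    if speed_mps <= 0:
        # A stopped vehicle has no latency constraint in this model.
        return "cloud"
    time_budget_s = distance_m / speed_mps
    if time_budget_s < cloud_latency_s + reaction_time_s:
        return "mec"
    return "cloud"
```

With an assumed Cloud latency of 50 ms and a reaction time of 450 ms, a vehicle at 50 km/h would use the MEC server at a distance of 6.5 m (d / v is about 0.47 s) and the Cloud server at 9.5 m (d / v is about 0.68 s).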

Fig. 5: Time diagram of interaction between servers and vehicle

IV. PROPOSED MECHANISMS

A. Dynamic switching Edge-Cloud

The aim of the proposed mechanism is a dynamic switching algorithm that avoids overloading the edge server by redirecting messages to the cloud server whenever possible. In particular, this choice is made by the vehicles, which decide whether to communicate with the edge or the cloud server on the basis of the priority of the communication. The decision depends on the current vehicle traffic conditions, with the goal of keeping the number of collisions as low as possible. It is made on the basis of the communication delay and, in particular, on the basis of latency and reaction time; recall that a collision occurs when the sum of latency and reaction time is greater than the ratio between the distance between the two vehicles and their speed (assumed to be the same). If this ratio is less than the sum of the Cloud latency and the reaction time of the vehicle, communication with the Edge is necessary, since choosing the Cloud would lead to a collision. Otherwise, communication with the Cloud is chosen. The algorithm is described by the flow chart in Fig. 6.

For each incoming message the vehicle must check whether it has received warnings from both the Cloud and the MEC server. Therefore, two CoapHandler procedures were needed to manage both communications asynchronously.

B. Server MEC

The MEC server is assumed to be implemented in an RSU unit on the roadside. On the basis of the information received from vehicles, it evaluates whether a dangerous event or an emergency braking has


happened and, if so, immediately notifies the following vehicles to avoid potential collisions.

Fig. 6: Flow chart relating to dynamic switching

Everything is summarized in the handlePOST method of the MEC server, which manages every single communication with the client. The server stores and decodes the information in the vehicle messages in order to evaluate whether a vehicle is braking abruptly. In particular, a dangerous event is identified if the difference between two consecutive speeds of a vehicle is greater than a fixed threshold. In this case, the server identifies all the vehicles arriving within a certain range and alerts them in time to avoid collisions. The operating criteria are described briefly in Fig. 7.

Fig. 7: Flow chart relating to the behavior of the MEC server

Once the Java server code was written, a Docker container could be created. This container was then published on a private repository created in the Docker Hub registry so that it can be managed with Kubernetes. In Kubernetes, a deployment has been created in which a Pod, characterized by the single container, is declared. Finally, a service has been created to make the Pod accessible.

V. PERFORMANCE ANALYSIS

In this section, simulation results are given in order to show the performance of the system. First of all, the considered simulation scenario is described. It is composed of the following elements: four vehicles at a distance d taken from the values [6; 6.5; 7; 7.5; 8; 8.5; 9; 9.5] and a constant speed v of 50 km/h, with the same deceleration and a reaction time t chosen at random in the interval [450, 500] ms [27]. Twenty runs have been carried out for each configuration: MEC and Cloud server communication with 4G and 5G technologies. In particular, the specific latency times of each mobile technology have been used. The simulation parameters are summarized in Table I.

First, the collision percentage has been assessed while varying the distance d between vehicles. Communication with the MEC and Cloud servers, with both 4G and 5G technologies, has been considered.

TABLE I: Simulation Parameters

Parameter                 Value
Number of vehicles        4
Vehicle distance d        6, 6.5, 7, 7.5, 8, 8.5, 9, 9.5 m
Vehicle speed v           50 km/h
Vehicle deceleration a    same for each vehicle
Reaction time t           [450, 500] ms
Server action range r     15-50 m

The results obtained are shown in Fig. 8.

As can be observed, for the same vehicle distance, the collision percentage using communication with the Edge is lower than with the Cloud, for both the 4G and 5G mobile technologies. It can also be noted that, in some cases, the number of collisions at the same vehicle distance decreases when moving from 4G to 5G. For example, at a distance of 6.5 meters, using communication with the MEC server first in 4G and then in 5G, the percentage of collisions decreases from 33.3% to 18.3%. Next, the collision percentage was evaluated no longer by changing the distance between vehicles, but by varying the action range of the server from the point of the hazard event. The considered scenario is the same as in the previous experiments, with an action range of the server equal to r meters from the point where the hazard event occurs. The first vehicle is forced at a certain moment to brake suddenly. The considered ranges are 15, 20 and 26 meters. These values have been chosen to ensure that the server warning is sent, respectively, to the first following vehicle, to the two following vehicles and finally to all three following vehicles (behind the one braking suddenly). The results obtained are shown in Fig. 9. As can be seen, with both 4G and 5G technology, and considering communication with both the Cloud and the MEC server, as the action range of the server increases, the percentage of collisions decreases. This is because a vehicle warned in


time of the danger will start braking earlier, avoiding a collision with the preceding vehicle. Therefore, the more vehicles are warned by the server, the lower the collision rate.

Fig. 8: Percentage of collisions with 4G and 5G network varying vehicle distance (panels (a) and (b)).

Fig. 9: Percentage of collisions with 4G and 5G network varying server action range (panels (a) and (b)).

Another parameter that has been evaluated is the MEC and Cloud server utilization percentage under dynamic switching. This algorithm makes it possible, especially in certain vehicular traffic conditions, to avoid overloading the Edge by communicating with the Cloud server. Three scenarios are considered in the evaluation, ranging from an uncongested one to a congested one, as shown in Figs. 10a, 10b and 10c.

The results are shown in Fig. 11.

When the traffic is congested, vehicles communicate more with the MEC server than with the Cloud server, since low latency is needed to avoid collisions. In this case, therefore, heavy usage of the Edge device is recorded. If the traffic is lightly congested, the utilization percentage of the two servers is almost the same. With uncongested traffic, instead, each vehicle communicates only with the Cloud server, since the latency constraints are not stringent.

VI. CONCLUSIONS

In this work, a collision avoidance mechanism for the automotive environment has been proposed, based on the Edge Computing paradigm. In particular, an assisted guidance system for collision prediction based on MEC technology has been developed. The use of the Edge is essential when latency constraints must be met. With Edge technology the system functionalities are moved closer to end users, and therefore to the edge of the network. The MEC and Cloud servers run in Docker containers managed by the Kubernetes orchestrator. In order to show the improvements of Edge Computing over the Cloud, a comparison evaluating the percentage of vehicle collisions has been conducted, showing a significant reduction in the number of collisions through the use of the MEC paradigm, due to its lower latency values. The experiments have also shown that, in some cases, the MEC-5G combination has the best performance in the considered system.

REFERENCES

[1] F. Giust, V. Sciancalepore, D. Sabella, M. C. Filippou, S. Mangiante, W. Featherstone, and D. Munaretto, “Multi-access edge computing: The driver behind the wheel of 5g-connected cars,” IEEE Communications Standards Magazine, vol. 2, no. 3, pp. 66–73, 2018.


(a) Uncongested vehicular traffic
(b) Low traffic congestion
(c) Congested vehicular traffic
Fig. 10: Three different scenarios of congestion.

Fig. 11: Percentage of server usage in switching

[2] F. Yang, S. Wang, J. Li, Z. Liu, and Q. Sun, “An overview of internet of vehicles,” China communications, vol. 11, no. 10, pp. 1–15, 2014.
[3] S. Sharma et al., “Vehicular ad-hoc network: An overview,” in 2019 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS). IEEE, 2019, pp. 131–134.
[4] A. F. Santamaria, P. Fazio, P. Raimondo, M. Tropea, and F. De Rango, “A new distributed predictive congestion aware re-routing algorithm for CO2 emissions reduction,” IEEE Transactions on Vehicular Technology, vol. 68, no. 5, pp. 4419–4433, 2019.
[5] P. Fazio, M. Tropea, and S. Marano, “Node re-routing and congestion reduction scheme for wireless vehicular networks,” Wireless Personal Communications, vol. 96, no. 4, pp. 5203–5219, 2017.
[6] F. De Rango, M. Tropea, P. Raimondo, and A. F. Santamaria, “Grey wolf optimization in vanet to manage platooning of future autonomous electrical vehicles,” in 2020 IEEE 17th Annual Consumer Communications & Networking Conference (CCNC). IEEE, 2020, pp. 1–2.
[7] P. Porambage, J. Okwuibe, M. Liyanage, M. Ylianttila, and T. Taleb, “Survey on multi-access edge computing for internet of things realization,” IEEE Communications Surveys & Tutorials, vol. 20, no. 4, pp. 2961–2991, 2018.
[8] D. Bernstein, “Containers and cloud: From lxc to docker to kubernetes,” IEEE Cloud Computing, vol. 1, no. 3, pp. 81–84, 2014.
[9] A. Socievole, F. De Rango, and C. Coscarella, “Routing approaches and performance evaluation in delay tolerant networks,” in 2011 Wireless Telecommunications Symposium (WTS). IEEE, 2011, pp. 1–6.
[10] A. Socievole, E. Yoneki, F. De Rango, and J. Crowcroft, “Opportunistic message routing using multi-layer social networks,” in Proceedings of the 2nd ACM workshop on High performance mobile opportunistic systems, 2013, pp. 39–46.
[11] F. De Rango, M. Tropea, A. F. Santamaria, and S. Marano, “An enhanced qos cbt multicast routing protocol based on genetic algorithm in a hybrid hap–satellite system,” Computer Communications, vol. 30, no. 16, pp. 3126–3143, 2007.
[12] ——, “Multicast qos core-based tree routing protocol and genetic algorithm over an hap-satellite architecture,” IEEE Transactions on Vehicular Technology, vol. 58, no. 8, pp. 4447–4461, 2009.
[13] B. Ji, X. Zhang, S. Mumtaz, C. Han, C. Li, H. Wen, and D. Wang, “Survey on the internet of vehicles: Network architectures and applications,” IEEE Communications Standards Magazine, vol. 4, no. 1, pp. 34–41, 2020.
[14] G. Rakesh and M. M. Belwal, “Vehicle collision avoidance in a vanet environment by data communication,” in 2019 3rd International Conference on Computing Methodologies and Communication (ICCMC). IEEE, 2019, pp. 238–242.
[15] C.-C. Chang, C.-A. Lai, and W.-M. Lin, “Iov-based collision avoidance by using confidence region,” in 2019 IEEE Eurasia Conference on IOT, Communication and Engineering (ECICE). IEEE, 2019, pp. 32–35.
[16] A. Kumar, M. Bansal et al., “A review on vanet security attacks and their countermeasure,” in 2017 4th International Conference on Signal Processing, Computing and Control (ISPCC). IEEE, 2017, pp. 580–585.
[17] R. Kaur, T. P. Singh, and V. Khajuria, “Security issues in vehicular ad-hoc network (vanet),” in 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI). IEEE, 2018, pp. 884–889.
[18] A. Bhatia, K. Haribabu, K. Gupta, and A. Sahu, “Realization of flexible and scalable vanets through sdn and virtualization,” in 2018 International Conference on Information Networking (ICOIN). IEEE, 2018, pp. 280–282.
[19] R. Hussain, J. Son, H. Eun, S. Kim, and H. Oh, “Rethinking vehicular communications: Merging vanet with cloud computing,” in 4th IEEE International Conference on Cloud Computing Technology and Science Proceedings. IEEE, 2012, pp. 606–609.
[20] L. Liu, C. Chen, Q. Pei, S. Maharjan, and Y. Zhang, “Vehicular edge computing and networking: A survey,” arXiv preprint arXiv:1908.06849, 2019.
[21] A. F. Santamaria, M. Tropea, P. Fazio, and F. De Rango, “Managing emergency situations in vanet through heterogeneous technologies cooperation,” Sensors, vol. 18, no. 5, p. 1461, 2018.
[22] “Docker: Empowering app development for developers,” https://www.docker.com/, 2019.
[23] “Kubernetes: Production-grade container orchestration,” https://kubernetes.io/, 2019.
[24] “Sumotraciconnection,” https://sumo.dlr.de/javadoc/traas/it/polito/appeal/traci/SumoTraciConnection.html, 2019.
[25] K. Puangnak, W. Puisamlee, K. Puangnak, and N. Rachsiriwatcharabul, “Evaluation of mqtt and coap for vehicle traffic monitoring,” in 2019 16th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology (ECTI-CON). IEEE, 2019, pp. 915–918.
[26] “Eclipse californium (cf) coap framework,” https://www.eclipse.org/californium/, 2019.
[27] M. Makridis, K. Mattas, D. Borio, R. Giuliani, and B. Ciuffo, “Estimating reaction time in adaptive cruise control system,” in 2018 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2018, pp. 1312–1317.


A Novel Deep Reinforcement Learning based service migration model for Mobile Edge Computing
Sung Woon Park Azzedine Boukerche Shichao Guan
School of Electrical Engineering and School of Electrical Engineering and School of Electrical Engineering and
Computer Science Computer Science Computer Science
University of Ottawa University of Ottawa University of Ottawa
Ottawa, Canada Ottawa, Canada Ottawa, Canada
spark163@uottawa.ca boukerch@site.uottawa.ca sguan049@uottawa.ca

Abstract—Cloud computing has emerged as a foundation of smart environments by encapsulating and virtualizing the underlying design and implementation details. To address its inherent latency and deployment issues, Mobile Edge Computing seeks to migrate services to the vicinity of mobile users. However, current migration-based studies do not consider migration cost, transaction cost, and energy consumption at the system level, nor do they discuss the impact of personalized user mobility. In this paper, we implement an enhanced service migration model to address user proximity issues. We formalize the migration cost, transaction cost, and energy consumption related to the migration process. We model the service migration issue as a complex optimization problem and adapt Deep Reinforcement Learning to approximate the optimal policy. We compare the performance of the proposed model with a recent Q-learning method and other baselines. The results demonstrate that the proposed model can estimate the optimal policy under complicated computation requirements.

Keywords—Mobile edge computing, service migration, deep reinforcement learning, energy consumption, migration cost

I. INTRODUCTION

Advanced mobile devices, considered one of the most innovative technologies, have brought significant convenience to everyday life. Such smart devices have tremendously boosted the development of many high-tech concepts, such as Smart City, Autonomous Vehicle, and Internet of Things (IoT), and new concepts are constantly being introduced. Cloud Computing, as a fundamental infrastructure for implementing the aforementioned technologies, draws increasing attention from academia and industry due to its elastic provisioning of computational and network capabilities [1]. One step further, Mobile Edge Computing (MEC) has been proposed to provide cloud services in proximity to mobile user equipment (UE), tackling the potential overhead caused by the physical distance between UE and cloud instances. Academia and industry have presented several conceptual models and proposals in connection with MEC to resolve related issues. Numerous offloading and resource management models have been introduced [2,3], and various platforms for distributed simulation in a distributed environment have been designed and implemented [4,5]. Despite the continuous advancements in MEC environments, numerous challenges remain to be resolved. The constraints of both limited computation capability and unpredictable user mobility can lead to the deterioration of Quality of Service (QoS). The service migration issue, which involves the coordination among geographically distributed MEC servers, is one of the core issues to be addressed with regard to the quantity and mobility patterns of the users and the heterogeneity of edge servers [6].

Service migration in MEC is a sophisticated optimization problem, since the decision on whether, when, and where to migrate relies on many dynamic environmental variables, including user mobility, communication channel characteristics, and resource availability [7]. In addition to the complexity of the input parameters, service migration in MEC inevitably introduces service delays: transmission delay, processing delay, and backhaul delay [8]. Several studies have been conducted to solve the migration problem between MEC servers, focusing on costs related to migrations. Although they yielded notable results, these solutions have not fully considered the impact of personalized user mobility and may not function properly in such complex scenarios. For instance, if the user equipment follows some mobility pattern and moves along the boundary of two adjacent edge servers, the policy of some existing distance-based methods (always find the nearest server) can cause repeated unnecessary migrations, which degrade energy efficiency and QoS due to the downtime or communication channel management overhead incurred by the migrations [9].

In this paper, to cope with the complex service migration environment in MEC, we propose an extensive service migration model based on Deep Reinforcement Learning (DRL). Compared to previous models, the extensive service migration model enables the controller to manage the migration process from a comprehensive perspective, which considers more environmental factors, including migration cost, transaction cost, and energy consumption. By using more variables and a larger number of MEC servers, we can implement a realistic simulation, although this increases the computational complexity. Therefore, we apply DRL to cope with the computation and to make it feasible to adopt more determinants. Based on the model, we also formulate a novel optimization problem to find an optimal policy for task migrations between MEC servers, such that balanced service migration decisions are achieved.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE


2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

The main contributions of this paper are as follows:

▪ We design an extensive service migration model that considers the migration and transaction cost, QoS, the energy efficiency of the mobile devices, and the migration loads on the servers. The system model performs the service migration according to the optimization policy related to the migration and transition costs and the energy consumption.
▪ We employ Deep Reinforcement Learning to approximate the complicated computation problem.
▪ We implement meaningful simulations to demonstrate that the proposed method surpasses the other models related to service migration.

The remainder of this paper is organized as follows: Section II presents the related research works in the MEC area, especially on service migration. In Section III, we describe our enhanced service model for the optimization problem of the service migration. Section IV formulates the algorithm based on Deep Q-Networks. In Section V, we demonstrate the efficiency of the model by analyzing the simulation results. Lastly, Section VI concludes the research.

II. RELATED WORK

Distributing resources optimally over selected edges is the crucial task of resource management, which involves optimization problems such as minimizing processing latency, balancing load, and maximizing server computing efficiency. Besides resource placement issues, attention to preserving energy is increasing, in order to observe and appropriately control the burden of energy consumption [9]. Several research efforts have addressed the Mobile Edge Computing (MEC) area. As a result, diverse MEC concepts, such as Small Cell Cloud (SCC), Mobile Micro Cloud (MMC), Fast Moving Personal Cloud, Follow Me Cloud (FMC), and CONCERT, have been proposed [10]. There are other studies on service migration models based on the Markov Decision Process (MDP). The authors in [11] not only demonstrated the existence of an optimal threshold policy for finding the optimal action of the MDP, but also devised a polynomial time-complexity algorithm to determine the optimal thresholds. Unlike previous works, which presented one-dimensional models, S. Wang et al. in [12,13] showed how to use a distance-based MDP to approximate the solution for 2-D mobility models in order to efficiently compute a service migration policy. They also showed how to apply the algorithms to the real world by using real mobility traces of taxis in San Francisco.

X. Sun and N. Ansari proposed a PRofIt Maximization Avatar pLacement (PRIMAL) strategy in order to optimize the trade-off between the migration gain and cost [14]. They used a heuristic algorithm based on a Mixed-Integer Quadratic Programming (MIQP) tool to solve the PRIMAL problem. A. Nadembega et al. in [15] proposed a mobility-based services migration prediction (MSMP) model to select an optimal micro data center (MDC). The model estimates the throughput of the user and the time for MDC service area hand-offs. They demonstrated that the proposed model surpasses other recent approaches in terms of data latency. The paper [7] presented a layered migration framework for active service applications using incremental file synchronization. The model is composed of three layers, namely the base layer, the application layer, and the instance layer, and the three-layered setup enables the model to reduce the service downtime. Ouyang et al. [16] devised an online service placement framework under a long-term time-average budget. The authors applied Lyapunov optimization to decompose the long-term problem into a series of real-time problems. They also utilized both Markov approximation and Best Response Update methods to optimize the problem.

Recently, some researchers have applied Artificial Intelligence techniques to enhance the capacity of their optimization models. T. G. Rodrigues et al. [8] provided an analytical model of service delay in MEC, utilizing a configuration phase to control the service delay elements. They analyzed the behavior of the fitness function using the Particle Swarm Optimization (PSO) algorithm. As a result, the model can attain the outcome in fewer iterations and with fewer particles. On the other hand, Reinforcement Learning is also used to find solutions for service migration issues. Z. Gao et al. in [17] not only designed a Q-learning based model to handle the complex environmental factors, but also utilized Deep Reinforcement Learning (DRL) to cope with the complicated computation of the action value Q. Similarly, responding to User Equipment with high mobility and changing mobility patterns, C. Zhang and Z. Zheng in [18] devised a method based on deep reinforcement learning. They used a Deep Q Network (DQN) to help the FMC controller generalize past experiences. The authors also defined the reward as the trade-off between QoS and the migration cost. Unlike MDP-based methods, the proposed DQN-based task migration algorithm operates without knowledge of the transition probability and reward function.

Apart from research related to service migration costs, several studies address the energy efficiency problem in MEC. J. Hu et al. [19] introduced a dynamic service migration model using optimal stopping theory, which is an effective tool for solving optimization problems. They considered the energy consumption factor associated with the migration distance to solve the migration path selection problem. In the paper [20], the authors formulated a Green-Oriented Problem (GOP), an energy minimization problem, and attempted to solve it by formulating a Mixed-Integer Nonlinear Program (MINLP) and applying Q-learning-based Reinforcement Learning. The article [21] introduced the fine-grained migration based on genetic algorithm (FGMBGA), a genetic-based algorithm, to reduce the energy consumption of terminal mobile devices as well as to ensure the smooth execution of tasks during the task migration process. Although they devised an optimal migration strategy, they only consider migrations between the smart mobile terminal and the MEC, instead of those between MECs. In [22], a task-centric offloading model is devised to address the significant communication overhead and the offloading energy consumption, both incurred by a large number of offloading requests. The task offloading can be organized according to the priority of local task interest, measured by constantly tracing the task execution.

TABLE I. SUMMARY OF RELATED WORKS

Problem Model | Methodology | Target Issue | Evaluation Parameter
Three Layer Model [7] | Layering | Active Service Migration | Migration time, Down time
Reconfiguring cloudlets [8] | PSO | Cloudlet Activation | Processing Delay, Transmission Delay, Backhaul Delay
Threshold Policy-Based Mechanism [11] | MDP | Service Migration | Running Time, Discounted Sum Cost
Distance Based MDP [12,13] | MDP | Service Migration | Computation time, Discounted Sum Cost
PRIMAL [14] | MIQP | Avatar Migration | E2e Delay, Migration Overhead
MSMP [15] | DAMP | Service Migration | Data Latency
Mobility-Aware Online Framework [16] | Lyapunov Optimization, Markov Approximation, Best Response Update | Service Migration | User-perceived Latency, Migration Cost
DRL Based Model [17] | DQN | Service Migration | Migration Cost, Communication Cost
DQN Based Model [18] | DQN | Service Migration | QoS, Migration Cost
DSM [19] | Optimal Stopping Theory | Migration Path Selection | Energy Consumption, Migration Distance
MINLP [20] | Q-learning | Energy Minimization | Energy Consumption
FGMBGA [21] | Genetic Algorithm | Energy Minimization | Energy Consumption
Proposed Model | DQN | Service Migration | Migration Cost, Transaction Cost, Energy Consumption

The purpose of our proposal is to present an optimal service migration model for the MEC environment that considers more of the factors affecting the migration process, such as the migration and transaction cost, QoS, and energy efficiency, than the existing models. With the optimization, we can minimize the cost of migration and transaction as well as the energy consumption, to achieve the maximum efficiency for the service migration. Many experimental attempts have been made by researchers, and some of them have achieved remarkable results with specific factors. To obtain realistic results for task migration, however, we need to consider a comprehensive model, which increases scalability and considers more environmental conditions. Consequently, we propose an expanded model, which utilizes not only the DQN as an approximator but also the Q-value function to estimate the optimal policy.
III. SERVICE MIGRATION MODEL

We consider a time-slotted model, with $t \in \mathcal{T} = \{1, \dots, T\}$, based on the FMC concepts, which has the advantage of suiting the proposed agent control model, since the FMC controller exists for managing distributed Data Center instances (i.e. MEC servers). The FMC controller decides whether, when, and where to migrate the services to enable UEs to satisfy QoS. The MEC locations are denoted as $m \in \mathcal{M} = \{1, \dots, M\}$, where the locations in $\mathcal{M}$ are 2-dimensional vectors and the distance between $m_i$ and $m_j$ is calculated as the norm of their difference, $\|m_i - m_j\|$. Also, we define the user location as $u(t)$ and the service location as $h(t)$ at timeslot $t$, respectively. Fig. 1 shows the FMC concept, whose main idea is that services follow users [23].

Fig. 1. Interworked distributed cloud/mobile networks architecture.

A. The migration and the transaction Cost function

We assume that the whole service is migrated when the action occurs at each timeslot, and we utilize the constant-plus-exponential cost functions from [12, 13]. The function involves both the migration and transition cost, and its usefulness as an approximator was experimentally demonstrated. Note that the state $s(t)$ represents the distance between the user and the MEC server at timeslot $t$, such that $s(t) = \|u(t) - h(t)\|$. The state after the action is denoted as $s'(t)$, and $h'(t)$ is the new service location, where $\|h(t) - h'(t)\| = |s(t) - s'(t)|$ and $s'(t) = \|u(t) - h'(t)\|$. The total cost consists of the migration cost and the transition cost, the latter both between the connected MEC and the service-located MEC and between the user and the connected MEC, which are denoted as $c_m(x)$ and $c_d(y)$, respectively; the expressions are as follows:

$$c_m(x) = \begin{cases} 0, & x = 0 \\ \beta_c + \beta_l \mu^{x}, & x > 0 \end{cases} \tag{1}$$

$$c_d(y) = \begin{cases} 0, & y = 0 \\ \delta_c + \delta_l \theta^{y}, & y > 0 \end{cases} \tag{2}$$

Where $\beta_c$, $\beta_l$, $\delta_c$, $\delta_l$, $\mu$, and $\theta$ are real-valued parameters, $x = |s(t) - s'(t)|$ for the migration cost and for the transition cost between the connected MEC and the service-located MEC, and $y = s'(t)$ for the transition cost between the user and the connected MEC.


The parameters $\mu$ and $\theta$ set the weight of each distance in the costs, and their values can be influenced by the network topology and the routing mechanism of the network. Besides, the parameters $\beta_l$ and $\delta_l$ control the costs proportionally.

The expression of the one-timeslot cost is given by (3).

$$C(s(t), a(t)) = c_m(|s(t) - s'(t)|) + c_d(|s(t) - s'(t)|) + c_d(s'(t)) \tag{3}$$

B. Energy Consumption function

We consider the energy consumption on servers related to the migration between MEC servers. Even though it is not a significant problem, since the servers are always plugged in, we can use it as a control criterion to cope with frequent migrations. The authors of [19] formulated the dynamic migration energy consumption and demonstrated its improvement. The method measures the energy consumption during the service migration, considering the content size, the noise, and the transmission capacity. Hence, we adopt the energy consumption formula for the service migration between MEC servers, which is given by (4).

$$E(t) = \frac{L \cdot N_0 \cdot \left(2^{r/W} - 1\right) \cdot |s(t) - s'(t)|}{r \cdot g} \tag{4}$$

Where $L$, $N_0$, $r$, $W$, and $g$ denote the service content size, the noise power, the transmission rate, the bandwidth, and the channel attenuation coefficient, respectively. We further employ the degree of service unexecuted, $\eta \in [0\%, 100\%]$, which is presented in [24], to approximate the running state of the service to be migrated.

IV. PROBLEM SOLUTION

In this section, based on the aforementioned service migration model, we formulate the DQN algorithm, which includes not only the Q-learning function but also the deep Q-network. The RL method allows the agent to deliver better estimations of the Q-value. The target problem of Q-learning is expressed as an MDP. In the RL algorithm, the agent takes a certain action when it encounters a specific situation in the environment. The action results in a reward and a new state, where the agent can perform another action. With this recursive process, the agent obtains the optimal policy by determining the actions that maximize the cumulative reward (shown in Fig. 2) [26].

Fig. 2. The agent–environment interaction.

We define the necessary factors (state, action, and reward) as follows:

▪ State: We define the state as the distance between the UE and the MEC server in which the service is located. The state is denoted as $s(t) = \|u(t) - h(t)\|$ at timeslot $t$.
▪ Action: The agent can move the state $s(t)$ to a possible $s'(t)$ by taking the action $a(t)$. The action is chosen from the action set $\{0, 1\}$, which indicates whether to migrate or not.
▪ Reward: The agent obtains the reward according to the action and the state. In this paper, we organize the objective function as the cost and the energy consumption. We define the reward function as the difference between an adequately large QoS value $I$ and the objective function. With equations (3) and (4), it is designed as follows:

$$R(s(t), a(t)) = I - \left(C(s(t), a(t)) + \rho \cdot E(t)\right) \tag{5}$$

where $\rho$ is a coordination factor that adjusts the scale difference between the cost and the energy consumption values.

A. Q-value function

The Q-value function represents the maximum value of the discounted total future rewards when the action $a$ is executed in state $s$. According to the Bellman equation, the Q function is the sum of the reward at the current state and the discounted maximum future reward at the next state, which is given by (6), where $\gamma$ is the discount factor.

$$Q(s, a) = r(s, a) + \gamma \max_{a'} Q(s', a') \tag{6}$$

With the Q-value function, the agent selects an optimal action for a certain state according to the policy, which is represented as follows:

$$\pi(s) = \arg\max_{a} Q(s, a) \tag{7}$$

Q-learning can estimate the optimal policy by calculating the Q-value function repeatedly. The Q-learning update, with learning rate $\alpha$, is denoted as below:

$$Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r(s, a) + \gamma \max_{a'} Q(s', a') - Q(s, a) \right] \tag{8}$$

B. Deep Q-network

In practice, it is unrealistic to apply the Q-value function directly because of the size of the input, i.e. the number of states. In other words, as the number of targeted states increases, more computation time and power are required to find the optimal value by tracking all the situations caused by the possible actions in a specific state, and the search may even be impossible to complete. Hence, it is necessary to approximate the Q-value [27]. We utilize the DQN, which includes a Deep Neural Network (DNN) stage, in this case a Convolutional Neural Network (CNN), as the approximator, and implement it with the experience replay method. We take the states as input parameters and conduct the feed-forward and the back-propagation processes to obtain the corresponding output Q-values (shown in Fig. 3).

Fig. 3. Architecture of deep Q-network

With the experience replay, we can prevent the DNN from being driven into a local minimum or diverging unexpectedly. We also adopt off-policy control, so that we separate the policy update into the behavior policy update and the target policy update. It enables the model to learn the optimal policy while following an exploratory policy. During the process, samples of experience are drawn uniformly at random from the pool $D = \{e_1, \dots, e_N\}$ of stored samples, which are the agent's experiences $e_t = (s_t, a_t, r_t, s_{t+1})$ at each step $t$. By conducting the update process with the loss function in Formula (9), where $w$ and $w^-$ denote the weights of the action Q-Network and the target Q-Network, we achieve the optimal policy for the service migration [28].

$$L(w) = \mathbb{E}_{(s, a, r, s') \sim U(D)} \left[ \left( r + \gamma \max_{a'} Q(s', a'; w^-) - Q(s, a; w) \right)^2 \right] \tag{9}$$

Algorithm 1 describes all the steps of the DQN process. The process starts by initializing the replay memory and the action Q-Network with random weights $w$. The target Q-Network, which is a separate network used to prevent oscillation and divergence of the policy, is initialized with the same values as $w$. During the learning loop, actions are selected according to an ε-greedy policy and executed, which results in the reward and the next state. The experiences that occur at each step are stored in the dataset $D$, which is randomly sampled for the Q-function value update. This approach allows us not only to achieve better data efficiency but also to break correlations between samples.

V. PERFORMANCE EVALUATION

In this section, we present numerical analysis to demonstrate the effectiveness of the proposed model. For the simulation, we assume that the user follows the random walk mobility model and that MEC servers are randomly deployed in a 100 × 100 grid area, to implement a generic simulation environment. To validate the improvement of the proposal, we compare the model with other algorithms: No migration, Always migration, and the Q-learning algorithm (discussed in Section V-B). We conduct the simulation with Python version 3.6 and the TensorFlow API, and implement the extensive service migration model and the environment model.

A. Parameter setting

Table II presents the details of the parameters used for the simulation. The values for the reward function are adopted from the constant values in [12, 13], and the energy function values from [19, 20, 25], which are logically proven and experimentally verified. We use $\delta_c = -0.4$, $\delta_l = 0.4$, and $\theta = 1.03$ for the transaction cost function parameters to emphasize the impact of fluctuations in the distance between the user and the serving MEC server. Furthermore, for the migration cost, we use $\mu = 0.8$, $\beta_l \in [-2, 0]$, and $\beta_c + \beta_l = 1$. Regarding energy consumption, we also evaluate the influence of the service content size using different values from 2 to 20.

TABLE II. SIMULATION PARAMETER VALUES

Parameters | Description | Value
$\delta_c$, $\delta_l$, $\mu$, $\theta$ | Parameters related to the constant-plus-exponential cost function | -0.4, 0.4, 0.8, 1.03
$\rho$ | Coordination factor | 1 10
$L$ | Service content size | 2 ~ 20
$N_0$ | Noise power | 3
$r$ | Transmission rate | 2
$W$ | Bandwidth | 100
$g$ | Channel attenuation coefficient | 0.2
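The energy term (4) and the reward (5) can be illustrated with the short Python sketch below. It is only a sketch under stated assumptions: the energy expression uses the Shannon-capacity form suggested by the quantities defined in Section III-B (content size, noise power, transmission rate, bandwidth, channel attenuation), all function and argument names are hypothetical, the defaults are the Table II values, and `qos_constant` and `rho` are placeholders because their actual values are not legible in the source.

```python
def migration_energy(distance, size=20.0, noise=3.0, rate=2.0,
                     bandwidth=100.0, attenuation=0.2):
    """Assumed form of (4): transmit power needed for `rate` times transfer time, scaled by distance."""
    # Power required to sustain `rate` over `bandwidth` (Shannon capacity inverted).
    power = noise * (2.0 ** (rate / bandwidth) - 1.0) / attenuation
    # Transfer time of the service content at the given rate.
    transfer_time = size / rate
    return power * transfer_time * distance


def reward(cost, energy, qos_constant=100.0, rho=0.1):
    """Reward (5): a large QoS constant minus the weighted objective."""
    return qos_constant - (cost + rho * energy)


if __name__ == "__main__":
    # Energy and reward for an illustrative migration distance and cost.
    e = migration_energy(distance=10.0)
    print(e, reward(cost=1.5, energy=e))
```

Because the energy grows linearly with both the migration distance and the content size in this form, a policy that migrates at every step is penalized more heavily as the content size increases.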

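The greedy policy (7) and the Q-learning update (8) from Section IV-A can be sketched for a small, discretized state space as follows. The learning rate `alpha`, the discount factor `gamma`, the integer state encoding, and the helper names are illustrative assumptions and are not taken from the paper.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0: keep the service in place, 1: migrate


def greedy_action(q, state):
    """Policy (7): pick the action with the largest Q-value in this state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def epsilon_greedy(q, state, epsilon, rng=random):
    """Explore with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return greedy_action(q, state)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Update (8): move Q(s, a) towards r + gamma * max_a' Q(s', a')."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Q-table that defaults to 0.0 for unseen (state, action) pairs.
q_table = defaultdict(float)
q_update(q_table, state=3, action=1, reward=10.0, next_state=1)
```

The table grows with the number of distinct states, which is the limitation that motivates replacing it with a neural approximator in Section IV-B.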
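The experience replay and the loss (9) described in Section IV-B can be sketched without any deep learning library as below. Here `q_online` and `q_target` are hypothetical callables standing in for the action Q-Network and the target Q-Network, and `ReplayMemory`, `dqn_loss`, and `gamma` are illustrative names and values rather than the authors' implementation.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size pool D of experiences (s, a, r, s_next)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest samples are dropped first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement from the stored experiences.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def dqn_loss(batch, q_online, q_target, actions=(0, 1), gamma=0.9):
    """Mean squared TD error over a batch, following the form of (9)."""
    total = 0.0
    for state, action, reward, next_state in batch:
        # Target uses the separate target network and the best next action.
        target = reward + gamma * max(q_target(next_state, a) for a in actions)
        total += (target - q_online(state, action)) ** 2
    return total / len(batch)
```

In a full training loop the weights behind `q_online` would be adjusted to reduce this loss, and `q_target` would be refreshed from them only periodically, which is what keeps the regression target stable.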

B. Performance of Model analysis

In the experiments, we train the extensive service migration model (ESM) based on DQN with random user movement patterns to minimize the total sum of the cost and the energy consumption related to the migration process. The trained model can select the appropriate action for the environment state according to the optimal policy. Under the No Migration algorithm, the user starts and ends with the service on the same MEC server, since the agent never lets the service migrate. Its reward value is only affected by the transaction costs related to the distances between the user, the service-located MEC, and the connected MEC. With the Always Migration algorithm, the agent moves the service to the closest MEC server as the user moves around the area. As a result, it reduces the transaction cost but increases the migration cost and the energy consumption. With the Q-learning algorithm, the agent follows a policy based on the Q-function and a Q-table. It repeatedly estimates the Q-value and updates the Q-table to find the optimal policy. The reward values of this algorithm are close to the ESM results; however, the number of states is limited, since the Q-table has a restricted range of index values. Namely, it is unsuitable for real circumstances, which have more state conditions or larger areas.

Fig. 4 shows the total reward sums of the experimental group as the number of MEC servers varies. Our proposed migration model achieves the highest reward sum for each number of MEC servers compared with the others. We cannot observe a consistent trend in the results because we conduct the experiments in a fixed area. As the number of servers increases in the fixed area, the density of servers increases, so the distance between servers tends to decrease, which reduces the migration and transaction costs related to distance while increasing the frequency of service migration. On the other hand, reducing the number of servers increases the distance between servers but reduces the number of migration targets, which diminishes the migration cost and energy consumption. Because these tendencies offset each other, the reward sum values do not show any trend with respect to the number of servers. The results indicate that the policy generated by the ESM model can appropriately choose optimal actions by interacting with the states that occur.

Fig. 4. Total reward sum regarding the number of MEC servers (x-axis: number of MEC, 10 to 40; y-axis: total reward sum; curves: ESM, Q-learning, Always Migration, No Migration).

C. Evaluation for Impact of variables

In the second experiment, we consider the variation of the total reward sum when the migration cost parameter $-\beta_l$ changes from 0 to 2, with the number of MEC servers fixed at 30 and the content size fixed at 20. The migration cost function follows an exponential curve, since its parameters satisfy $\beta_c + \beta_l = 1$. Therefore, as $-\beta_l$ increases from 0 to 2, the results gradually rise and eventually converge to specific values.

As shown in Fig. 5, the results of ESM, Q-learning, and Always Migration, which involve service migration, decrease slightly as the value of $-\beta_l$ increases, while No Migration displays no fluctuation because it involves no migration. Although their results also shift slightly with the value of $-\beta_l$, ESM and Q-learning maintain large gaps over the other two algorithms in the reward sum value.

Fig. 5. Total reward sum regarding the migration cost parameter $-\beta_l$, with 30 MEC servers and a content size of 20 (x-axis: migration cost parameter, 0 to 2; y-axis: total reward sum; curves: ESM, Q-learning, Always Migration, No Migration).

A similar result is observed in the third experiment set, which examines the influence of the content size on the reward values (shown in Fig. 6). With a low content size, ESM, Q-learning, and Always Migration achieve better rewards with regard to migration energy consumption. In particular, Always Migration seems to be more affected by this parameter. Compared to its result in the second experiment, the algorithm shows overall improved outcomes. Apart from that, ESM achieves the best reward sum regardless of the content size. This indicates that the model is properly trained to obtain the optimal policy under variations of the environmental factors.

D. Evaluation for Combination of DQN variables

In the last experiment, we verify the impact of the DQN composition by observing the results of networks that have different hidden layers or policy update methods. We consider three different hidden layer compositions, i.e. 1 hidden layer, 2 hidden layers, and 3 hidden layers. As shown in Fig. 7, both the total reward value and the loss value follow the same trend, even though some values show slight deviations. Therefore, the number of hidden layers has no observable impact in our experiments.


Fig. 6. Total reward sum regarding the service content size (x-axis: service content size, 2 to 20; y-axis: total reward sum; curves: ESM, Q-learning, Always Migration, No Migration).

In terms of the policy update methods, we compare the off-policy method, which the proposed model uses, and the on-policy method as a control group. Fig. 8 shows the results for both the total reward value and the loss value of the two methods. There is a notable difference in the loss results between the two methods, while there is no notable difference in the total reward values. The off-policy method stably minimizes the loss value because it divides the policy into the target and the behavior policies to avoid local optima. On the other hand, the on-policy method shows a large fluctuation at the beginning, and more attempts are needed to minimize the loss value. Hence, the off-policy update method that we select offers the opportunity for better results when the model is applied to a real environment.

Fig. 7. Results regarding hidden layers (panels: total reward sum and loss value versus epoch; curves: 1 hidden layer, 2 hidden layers, 3 hidden layers).

Fig. 8. Results regarding policy update methods (panels: total reward sum and loss value versus epoch; curves: off-policy, on-policy).

VI. CONCLUSION

In this paper, we proposed an extensive service migration model based on DRL to handle the service migration problem. In the proposed model, DQN is utilized to solve the inevitable computation problem that arises when designing a service migration model that considers more of the factors occurring in a real environment. By demonstrating that the high-complexity computations can be performed using DQN as an approximator, we show that a realistic service migration scenario is feasible. In other words, we can derive the optimal policy under various determinants. Besides, we formulate the reward function to balance the trade-off between service migration and energy consumption. We are convinced that the policy derived from this formulation can be applied in practice.

As the next step, we plan to design an agent that controls MEC servers connected with multiple users by adopting a more sophisticated reward function. Furthermore, we will explore various DQN algorithms for the proposed model to enhance the training step by comparing their effectiveness.

REFERENCES

[1] E. Ahmed and M. H. Rehmani, “Mobile Edge Computing: Opportunities, solutions, and challenges,” Future Generation Computer Systems, 2017.
[2] S. Guan and A. Boukerche, “Design and Implementation of Offloading and Resource Management Techniques in a Mobile Cloud Environment,” in Proceedings of the 17th ACM International Symposium on Mobility Management and Wireless Access (MobiWac '19), pp. 97–102, 2019.
[3] S. Guan and A. Boukerche, “A MEC-based Distributed Offloading Model for Ubiquitous and Time-constraint Offloading,” 2019 IEEE/ACM 23rd International Symposium on Distributed Simulation and Real Time Applications (DS-RT), pp. 1-8, 2019.
[4] S. Guan, R. E. De Grande, and A. Boukerche, “A Multi-Layered Scheme for Distributed Simulations on the Cloud Environment,” IEEE Transactions on Cloud Computing, vol. 7, no. 1, pp. 5-18, 2019.
[5] S. Guan, R. E. De Grande, and A. Boukerche, “An HLA-Based Cloud Simulator for Mobile Cloud Environments,” 2016 IEEE/ACM 20th International Symposium on Distributed Simulation and Real Time Applications (DS-RT), pp. 128-135, 2016.
[6] S. Wang, J. Xu, N. Zhang, and Y. Liu, “A Survey on Service Migration in Mobile Edge Computing,” IEEE Access, 2018.
[7] A. Machen, S. Wang, K. K. Leung, B. J. Ko, and T. Salonidis, “Live Service Migration in Mobile Edge Clouds,” IEEE Wireless Communications, 2018.
[8] T. G. Rodrigues, K. Suto, H. Nishiyama, N. Kato, and K. Temma, “Cloudlets Activation Scheme for Scalable Mobile Edge Computing with Transmission Power Control and Virtual Machine Migration,” IEEE Transactions on Computers, 2018.
[9] A. Boukerche, S. Guan, and R. E. De Grande, “Sustainable offloading in mobile cloud computing: Algorithmic design and implementation,” ACM Computing Surveys (CSUR), vol. 52, no. 1, p. 11, 2019.
[10] P. Mach and Z. Becvar, “Mobile Edge Computing: A Survey on Architecture and Computation Offloading,” IEEE Communications Surveys and Tutorials, 2017.
[14] X. Sun and N. Ansari, “PRIMAL: PRofIt Maximization Avatar pLacement for mobile edge computing,” in 2016 IEEE International Conference on Communications, ICC 2016, 2016.
[15] A. Nadembega, A. S. Hafid, and R. Brisebois, “Mobility prediction model-based service migration procedure for follow me cloud to support QoS and QoE,” in 2016 IEEE International Conference on Communications, ICC 2016, 2016.
[16] T. Ouyang, Z. Zhou and X. Chen, “Follow me at the edge: Mobility-aware dynamic service placement for mobile edge computing,” IEEE J. Sel. Areas Commun., vol. 36, no. 10, pp. 2333-2345, Oct. 2018.
[17] Z. Gao, Q. Jiao, K. Xiao, Q. Wang, Z. Mo, and Y. Yang, “Deep reinforcement learning based service migration strategy for edge computing,” in Proceedings - 13th IEEE International Conference on Service-Oriented System Engineering (SOSE), 2019.
[18] C. Zhang and Z. Zheng, “Task migration for mobile edge computing using deep reinforcement learning,” Future Generation Computer Systems, 2019.
[19] J. Hu, G. Wang, X. Xu, and Y. Lu, “Study on Dynamic Service Migration Strategy with Energy Optimization in Mobile Edge Computing,” Mobile Information Systems, vol. 2019, p. 5794870, 2019.
[20] Y. Yang, X. Chen, Y. Chen, and Z. Li, “Green-oriented offloading and resource allocation by reinforcement learning in MEC,” in Proceedings - 2019 IEEE International Conference on Smart Internet of Things, 2019.
[21] Y. Wang, H. Zhu, X. Hei, Y. Kong, W. Ji, and L. Zhu, “An energy saving based on task migration for mobile edge computing,” EURASIP Journal on Wireless Communications and Networking, 2019.
[22] A. Boukerche, S. Guan and R. E. De Grande, “A Task-Centric Mobile Cloud-Based System to Enable Energy-Aware Efficient Offloading,” in IEEE Transactions on Sustainable Computing, vol. 3, no. 4, pp. 248-261, 1 Oct.-Dec., 2018.
[23] T. Taleb and A. Ksentini, “An analytical model for follow me cloud,” in GLOBECOM - IEEE Global Telecommunications Conference, 2013.
[24] A. Machen, S. Wang, K. K. Leung, B. J. Ko, and T. Salonidis, “Poster: Migrating running applications across mobile edge clouds,” in Proceedings of the Annual International Conference on Mobile Computing and Networking, MOBICOM, 2016.
[25] N. Xia, M. Tang, J. Jiang, D. Li, and H. Qian, “Energy Efficient Data Transmission Mechanism in Wireless Sensor Networks,” International Symposium on Computer Science and Computational Technology, 2008.
[26] R. S. Sutton and A. G. Barto, “Reinforcement Learning: an Introduction,” Cambridge, Mass: MIT Press, 2018.
[11] S. Wang, R. Urgaonkar, T. He, M. Zafer, K. Chan, and K. K. Leung, “Mobility-induced service migration in mobile micro-clouds,” in Proceedings - IEEE Military Communications Conference MILCOM,
2014.
[27] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D.
[12] S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung, Wierstra, and M. Riedmiller, “Playing atari with deep reinforcement
“Dynamic service migration in mobile edge-clouds,” in Proceedings of learning,” arXiv preprint arXiv:1312.5602, 2013.
14th IFIP Networking Conference, 2015.
[28] V. Mnih, K. Kavukcuoglu, D. Silver, A.A. Rusu, J. Veness, M.G.
[13] S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung, Bellemare, A. Graves, M. Riedmiller et al., “Human-level control through
“Supplementary Materials for Dynamic Service Migration in Mobile deep reinforcement learning”, Nature, vol. 518, no. 7540, pp. 529-533,
Edge-Clouds”. 2015.

91
2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Real-time Feedback in Node-RED for IoT Development: An Empirical Study

Diogo Torres, DEI, Faculty of Engineering, University of Porto, Porto, Portugal, diogo.torres@fe.up.pt
João Pedro Dias, INESC TEC & DEI, Faculty of Engineering, University of Porto, Porto, Portugal, jpmdias@fe.up.pt
André Restivo, LIACC & DEI, Faculty of Engineering, University of Porto, Porto, Portugal, arestivo@fe.up.pt
Hugo Sereno Ferreira, INESC TEC & DEI, Faculty of Engineering, University of Porto, Porto, Portugal, hugo.sereno@fe.up.pt

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

Abstract: The continuous spread of the Internet-of-Things across application domains, aided by the continuous growth in the number of Internet-connected devices and systems, has both increased the complexity of these systems and exposed a lack of human resources with the expertise to design, develop, and maintain them. Recent works try to mitigate these issues by creating solutions that abstract the complexity of the systems, such as visual programming languages. Node-RED, one of the most common solutions for the visual development of IoT systems, still has several limitations, such as the lack of observability and inadequate debugging mechanisms. In this work, we address some of these limitations by enhancing Node-RED with new features that improve the user's system development, debugging, and understanding tasks. We then empirically evaluate the impact of these enhancements, concluding that, overall, they reduce the development time and the number of failed attempts to deploy the system.

Index Terms: Internet-of-Things, Node-RED, Software Engineering, Monitoring, Debugging

I. INTRODUCTION

Internet-of-Things (IoT) systems permeate our daily lives by making everyday objects available everywhere and anytime. This pervasiveness of smart objects creates a foundation for a more interactive environment between things and humans, with the potential (and promise) of improving the quality of life. IoT's unique characteristics (communication, identification, and interactivity) are what make these systems so useful in applications such as home automation, transportation, manufacturing, healthcare, farming, and retail [1], [2]. The growing number of IoT systems and their increasing complexity (which can be observed in aspects such as the multitude and continuous growth in diversity of communication protocols, architectures, and development solutions), together with the pervasiveness in application domains, has led to several shortcomings, including the lack of human resources having the technical knowledge needed to develop IoT systems [3], [4].

In an attempt to tackle both the growing complexity of developing IoT systems and the lack of specialized resources, several approaches have been proposed (by both industry and academia) empowering the so-called end-user development, allowing users with little to no technical knowledge to develop and configure their IoT systems [5], [6]. Among those, the most common are visual programming solutions, generally in the form of Visual Programming Languages (VPLs), as they were already used in tasks such as the development of Programmable Logic Controller (PLC) systems [7]. IoT systems are commonly created and managed using VPLs, either at the fog or cloud tiers [8], by allowing users to define the system's behavior by manipulating visual elements rather than text. Among these solutions, we can highlight Node-RED as one of the most used [9], being an open-source tool that allows the mashup of hardware devices, APIs, and third-party services, in a hybrid text-visual programming approach [10]. To compose the rules of the system, Node-RED allows the creation of flows connecting nodes that represent the various elements of the system (e.g., sensors and actuators). Node-RED provides not only a VPL but also a runtime environment that executes the constructed flows.

As the system complexity evolves, understanding what is happening becomes harder, as Node-RED fails to present feedback to the user during development [11]. This makes it difficult for a user to create and modify existing rules while ensuring that changes do not break the expected behavior [12]. Node-RED lacks mechanisms to inspect the inner workings of a node, to inject or modify messages during runtime, or even to verify that connections between nodes will not raise runtime errors. Its debug capabilities are also inadequate, relying on “log to console” strategies, which leads to the proliferation of non-essential debug nodes in the flows. Every change the user makes to the system, even to add debug nodes, requires a new deployment. Common mitigation strategies include external solutions that provide visualization and monitoring mechanisms that allow understanding how the system is behaving (making the system observable to a certain degree), mostly through log analysis [13], [14].

Even considering other less popular solutions for developing IoT systems, they share similar downsides, including lack of observability (feedback between the development environment and the system under development) and weak, or nonexistent, mechanisms to properly debug the system (most rely solely upon debug messages). While various research works approach
Node-RED and its issues, most of them do not focus on the development experience, but rather on other (internal) aspects such as how to distribute the runtime environment [15]–[17].

Fig. 1: The Debug Node, represented on the left. On the right, a Debug sidebar where all the information provided by these nodes is displayed.

Considering these issues, shared across the visual programming for IoT landscape, we propose a set of enhancements to mitigate them. We also implemented a proof-of-concept (named NODERED-CAULDRON), built on top of the original Node-RED, which fulfills these enhancements. NODERED-CAULDRON allows users to check the runtime state of the system (i.e., observing the input/output of a given node), use debugging mechanisms like breakpoints (i.e., “pause” the incoming messages of a given node and understand each message that flows through it), and perform runtime modifications (i.e., inject and change messages).
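
The pause, step, and inject behavior listed above can be pictured with a small queue-based model. The following is only a sketch under stated assumptions: the class `BreakpointNode`, its method names, and the dictionary-shaped messages are hypothetical illustrations written in Python, not Node-RED's API or the NODERED-CAULDRON implementation (Node-RED itself runs on JavaScript), and message timing is ignored.

```python
from collections import deque


class BreakpointNode:
    """Toy model of a node with a breakpoint (hypothetical names, not Node-RED's API)."""

    def __init__(self, downstream):
        self.downstream = downstream  # callable that receives released messages
        self.paused = False
        self.queue = deque()          # messages held while the node is paused

    def receive(self, msg):
        """Hold the message while paused, otherwise pass it on immediately."""
        if self.paused:
            self.queue.append(msg)
        else:
            self.downstream(msg)

    def pause(self):
        self.paused = True

    def step(self, new_payload=None):
        """Release the oldest queued message, optionally changing its payload first."""
        if not self.queue:
            return None
        msg = self.queue.popleft()
        if new_payload is not None:
            msg = {**msg, "payload": new_payload}
        self.downstream(msg)
        return msg

    def inject(self, payload):
        """Inject a hand-written message as if it had arrived from upstream."""
        self.receive({"payload": payload})

    def resume(self):
        """Unpause and flush whatever is still queued, in arrival order."""
        self.paused = False
        while self.queue:
            self.downstream(self.queue.popleft())


seen = []
node = BreakpointNode(seen.append)
node.receive({"payload": 21.5})   # passes straight through
node.pause()
node.receive({"payload": 20.0})   # held in the queue
node.inject(19.0)                 # also held, because the node is paused
node.step(new_payload=25.0)       # releases the 20.0 message as 25.0
node.resume()                     # releases the injected 19.0 message
print([m["payload"] for m in seen])  # [21.5, 25.0, 19.0]
```

In this model, pausing never drops messages; it only delays them, which is why a modified or injected message reaches the downstream node through the same path as a regular one.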
To empirically assess how, and how much, these enhancements impact the performance of users when building, evolving, and maintaining IoT systems, an experimental phase followed where 20 participants had to carry out a set of tasks in the two different Node-RED versions (original and enhanced with our proof-of-concept). The overall results show that the added enhancements improve users' ability to develop IoT systems and ease the process of understanding how the system is behaving.

This paper is structured as follows: In Section II, an overview of the existing works on improving Node-RED is given, along with a summarized review of VPLs from other application domains, presenting both already addressed and open challenges. Section III presents our approach to enhancing Node-RED, as well as some details of the proof-of-concept developed. In Section IV, the experimental phase is documented along with an analysis and discussion of the results obtained. Finally, some closing remarks are given in Section V.

II. RELATED WORK

The use of VPLs has become widely adopted in the IoT area to improve the development of these systems. However, most of these tools focus on the rapid prototyping of systems, having fundamental gaps in the mechanisms that allow the evolution and maintenance of these systems, including debugging and runtime observation mechanisms. The only way to debug the system is through the source-level debug information provided, which requires a more significant effort than the original visual programming task [18].

A. Node-RED Observability

Node-RED comes natively with debug nodes that display messages in the debug sidebar within the editor, as shown in Fig. 1. Alongside each message, the debug sidebar includes information about the time the message was received and which Debug node sent it [19]. These debug nodes have a button that can be used to enable or disable their output. This type of implementation is also mentioned by Zodik et al. [18], who note that this set of debug nodes can generate progress reports during execution that include content and structure messages that are displayed to the user.

However, in complex systems, adding several debug nodes that write to the same sidebar makes the debugging process very difficult. Nonetheless, some authors use a similar approach by implementing their own debug nodes but with some external components to process the information coming from them. Ancona et al. [20] describe a way to implement runtime monitoring on Node-RED with trace expressions, by instrumenting the source code of the program that needs to be verified, and adding a new monitor node in Node-RED that captures all the relevant events for the domain in use. Their approach allows dynamically checking if the Node.js APIs are being correctly used, offering an interface that allows the use of these mechanisms for runtime verification of Node-RED components.

B. Similar Work

Visual programming approaches have been explored in other domains to ease development, such as game development and graphical animation. We provide an overview of two platforms that inspired our modifications to Node-RED.

1) Blender: an open-source 3D creation tool which packs a visual programming editor, the Nodes Editor (Fig. 2). This editor provides real-time feedback about the resulting 3D objects during development, and supports three types of nodes: Shader Nodes, Composite Nodes, and Texture Nodes [21].

Fig. 2: An example of visual metaphors in Blender [21].

All nodes in Blender have a similar structure; it is possible to observe their information (such as their title, inputs/outputs, settings, and values) in different visual metaphors like values
or plots, as shown in Fig. 2. A node's socket input and output values are color-coded according to the data type (e.g., color, numeric, vector, or shader) they handle.

2) Unreal Engine: Unreal Engine's Blueprints feature allows the creation of gameplay elements through classes, which are built by wiring function blocks and property references together [22], [23].

Fig. 3: Blueprints debugging mode, where it is possible to watch the current path of the messages (i.e., the highlighted one) [22], [23].

Blueprints also provides a debugger capable of pausing the execution of the game and stepping through the graph nodes by using breakpoints. This debugger allows seeing the current flow of the messages (and their values), as well as other nodes' variables, as shown in Fig. 3. It also provides a Call Stack and an Execution Trace that shows a list of executed nodes and allows further runtime inspection.

C. Discussion

There is a considerable number of visual programming solutions for IoT [24]; however, they are typically limited in ways similar to Node-RED. For instance, none of them provides immediate feedback during development or at runtime [25]. Other known limitations include (but are not limited to):

Observability: There is no way to visualize the information that flows through the system in the development component. Both the Blender Nodes editor and Unreal Engine provide such features in other application domains.
Structural Correctness: There is no verification that the connections between nodes will not raise runtime errors, which implies that many faults will only emerge after deployment. As some errors may only appear under specific conditions, it is harder (even impossible) to assert the system's correctness (one must let the system run for some time in a testbed or simulation setup to check for potential errors [26]). The type system of Blender Nodes reduces this need by checking for valid connections between nodes at design time.
Runtime Modification: They do not allow changing messages at runtime to check the system's behavior (e.g., when a temperature sensor emits a reading, it should be possible to change it). Thus, it is impossible to inject faulty messages to verify if the system reacts as expected.
Exploration: Every change in the system requires its deployment to production, including adding debugging nodes. Solutions such as Unreal Engine's debugging capabilities allow debugging without re-deploying the system.

Ancona et al. [20] and similar solutions attempt to address some of these challenges. Though they are capable of providing real-time information about the running system (which can be leveraged to provide self-healing capabilities [9]), they do not provide any real-time visual feedback in the Node-RED editor. To the best of our knowledge, no existing research provides any kind of structural correctness verification at design time, nor any feature regarding runtime modification and exploration, in visual development solutions for IoT.

III. ENHANCING NODE-RED

Inspired by the existing features of VPLs from other application domains, we consider that these issues could be addressed in visual programming solutions for IoT. We consider that this would improve the development of IoT systems by reducing the development time, the number of bugs created during development, and overall system maintenance. We start by presenting a motivational scenario depicted in Fig. 4.

Fig. 4: Whenever the temperature falls below 22ºC, the heating system must turn on until the temperature reaches that value.

We modified Node-RED to augment the system's observability and improve the feedback loop between the development environment and its runtime, aiming to improve the users' ability to build, evolve, and maintain IoT systems.

NODERED-CAULDRON focuses on addressing some of the missing features identified in Section II: (1) Observability, by providing the ability to show the information which flows through the nodes using different visual metaphors, (2) Runtime Modification, by allowing the injection of messages during runtime, and (3) Exploration, by enhancing the debug capabilities through breakpoints on each node without the need for re-deployments. With our approach, each node presents each input's messages; in nodes without any input, the output is shown. Thus, all the information flowing through all nodes is observable without the need to add new ones (cf. Fig. 5). Using a Switch node as an example, we can observe the added features in detail (cf. Fig. 6).

Fig. 5: The flow from Fig. 4 with the NODERED-CAULDRON features, namely the message plots and extra debug options.

Leveraging the already existing communication mechanism (between the runtime and the UI), a new topic was added that allows showing the runtime data (i.e., messages between nodes) in the UI. Using this additional communication channel, we can visualize incoming messages through two different plots (i.e., if the message's payload type is a number, it displays a line plot; otherwise, a scatter plot). These plots let the user perceive the values received, or at what pace they are coming. By allowing this communication to be bidirectional, and applying the same strategy, we can inject messages into the runtime to test specific system behavior.

We also added extra debugging capabilities, such as breakpoints. This allows the user to “pause” incoming messages for a given node by queuing them. The user can also step forward one message at a time and change its payload. It is also possible to clear all the queued messages. When the node is “unpaused”, the queued messages are released at the “same” frequency at which they were received (i.e., the time between the last two messages is calculated, and this value is used to set the pace). We further enhanced the Debug Node with the same message visualization capabilities as the other nodes.

IV. EXPERIMENTS AND RESULTS

Our goal is to verify if these changes impact the development process. We carried out a controlled experiment to compare the performance and behavior of two developer groups [27], [28]. We hypothesized that these characteristics would improve the ability of users to successfully build, evolve, and maintain IoT systems faster, more easily, and with fewer errors. Specifically, we aim to answer the following research questions:
RQ1. Would users with increased exposure to real-time information about the running system build and manage it faster?
RQ2. Does providing users with real-time feedback increase their ability to understand and change existing systems?
RQ3. Is an IoT visual programming environment able to reduce human-induced errors during development by providing real-time feedback?
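
These questions are answered in Section IV-C by comparing the two groups with Student's t-test. As a rough illustration of that comparison, the sketch below recomputes the pooled-variance t statistic from summary statistics, using the ET1 deployment figures reported in Table III. It assumes 10 participants per group (the N listed in Table I) and a two-sided significance level of 0.05, for which the critical value at 18 degrees of freedom is about 2.101. The function name `pooled_t` is hypothetical, and this is not the authors' analysis script.

```python
import math


def pooled_t(mean_a, sd_a, mean_b, sd_b, n_a=10, n_b=10):
    """Student's t statistic (equal variances assumed) from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# ET1 deployments from Table III: GA mean 7.90 (sd 3.60), GB mean 3.00 (sd 1.05).
t_stat, df = pooled_t(7.90, 3.60, 3.00, 1.05)

# Two-sided critical value of Student's t for 18 degrees of freedom at alpha = 0.05.
T_CRIT_DF18 = 2.101

print(f"t = {t_stat:.2f}, df = {df}, reject H0: {abs(t_stat) > T_CRIT_DF18}")
# t = 4.13, df = 18, reject H0: True
```

A statistic this far above the critical value is consistent with the ρ-value below 0.01 that the paper reports for ET1 deployments. Working from rounded means and standard deviations only approximates a test run on the raw measurements.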

Fig. 6: An example with a Switch node in NODERED-CAULDRON. On the top right, there is a Debug Button (1) that allows expanding/collapsing the messages' plot (2) and the Show More button (3). This Show More button reveals functionalities related to the messages and the breakpoint system. For messages, it shows the current message to process (5), and buttons (4) to access the input and output message history, clear this history, and inject messages into the current node. For the breakpoint system (6), it allows pausing/starting message processing (queuing the incoming messages) and processing one message at a time by using the step button. This step button also allows the modification of the current message. The trash button clears the breakpoint's queue.

A. Experimental Parameters

We started with a preliminary assessment of our procedure with two participants having distinct backgrounds: (1) a casual Node-RED user and (2) a user with no previous experience in Node-RED. Afterwards, we adopted the following parameters for the full study:

1) Experiments: They consisted of (a) debugging, (b) improving, and (c) creating an IoT system using Node-RED; hence, development experience and basic familiarity with IoT were required;
2) Participants: The sample size was twenty participants, all of them final-year computer science students with at least basic IoT knowledge, but with no Node-RED experience;
3) Duration: To avoid overloading participants while providing a reasonable time to finish all of the tasks, the duration of the experiment was set to 90 minutes, with a 25-minute timeout per task;
4) Procedure: We used a mix of quasi-experimental and ethnographic research. The population was split into two groups, GA and GB, with different treatments: GA used unmodified Node-RED, and GB used our tool. As there were no guarantees of equal technical knowledge among groups, two control tasks (CT) were performed to provide basic familiarity with the tool. Following these, three experimental tasks (ET) were given to each group, viz. (a) debug, (b) improve, and (c) create a system from scratch. In these three tasks, GB was provided with additional documentation regarding the available new features. All tasks were solved in the same order, with a short break between them;
5) Environment: All experiments were conducted in a remote environment (due to the COVID-19 pandemic). The needed tools were hosted on a private virtual server. Video call software was used to communicate and provide access to the participant's screen. With this procedure, it was possible to observe and take notes on the participant's behavior, clarify doubts related to the tasks, and verify if a certain outcome was correct;
6) Data: For both treatments we recorded: (a) the time taken to reach the solution; (b) the number of deployments made; and (c) the number of verification requests (i.e., every time the user thought the task was finished). For GB, the number of clicks on each new functionality was also recorded;
7) Post-test: A survey was carried out to assess the overall participant experience and to collect improvement suggestions. For this, we resorted to five statements evaluated using a Likert scale, three related to existing functionalities in NODERED-CAULDRON, and two regarding future improvements. We slightly adapted some questions to match the specificities of the different treatments.

B. Tasks

To run the experiments under equal operating conditions, a sensor/actuator simulator with deterministic behavior was developed to provide real-time data (a continuous flow of messages). This simulator implements mechanisms to validate the correctness of the experimental outcomes. The CTs were:

CT1. A preliminary task where Node-RED is introduced alongside the process of creating a simple flow. It shows how to manually inject messages in a flow (using the Inject node), parse them with custom JavaScript (using the Function node), and then display them in the sidebar (using the Debug node);
CT2. A task where data from seismometers must be used to activate an alarm, depending on the inferred earthquake's magnitude. This task introduced new nodes and logic (e.g., read data from sensors, add intermediate logic, send commands to the actuators) to be used in later tasks.

The first two ETs were both based on a smart farming scenario where a system would automatically control a strawberry plantation inside a greenhouse. A third task focused on the development of a simple smart home system:

ET1. A debugging task with a set of rules. The system was capable of keeping the soil at a certain moisture and temperature level. For this, the user was able to control (a) a heating system, (b) an irrigation mechanism, and (c) automatic windows. These were controlled by a humidity/temperature sensor. These rules had some bugs related to (a) erroneous conditions, (b) wrong commands sent to the actuators, and (c) mismatched field accessors;
ET2. An improvement task, where the user is responsible for adding a new feature to the current system, by using new devices (both sensors and actuators): (a) the status of the UV lamps should be adjusted according to weather forecasts, and (b) if the UV lamps are on, the window should be closed;
ET3. An implementation task, where the user must create a simple smart home system. Two different types of rules were given: (a) the lights should turn on when there is movement in the kitchen, and (b) every day at a given hour, the water heater and the coffee machine should be turned on (recurrent rule).

C. Results

We now provide an analysis of the results for both the Control and Experimental Tasks. We discarded CT1, as it was mostly used as a sanity check.

1) Control Task: We used CT2 to verify if there was a statistical difference between the two experimental groups by measuring the time spent and number of deployments required, as presented in Table I.

TABLE I: Time spent and number of deployments in CT2. Times are in mm:ss.

Measure  Grp  N   Mean  σ     Med   S-W (ρ)  Levene (ρ)  t-test (ρ)
Time     A    10  8:30  2:00  9:09  0.69     0.54        > 0.99
Time     B    10  8:30  2:15  8:54  0.61
Deploys  A    10  4.00  1.25  4.00  0.55     0.75        0.87
Deploys  B    10  3.90  1.37  3.50  0.16

We start with Levene's test, verifying if both groups are from populations with equal variances. As the obtained ρ-value is 0.54 for time and 0.75 for the number of deployments, we cannot reject the null hypothesis (i.e., both groups present equal variances). A Shapiro-Wilk test verifies if each of the groups was drawn from a population with a normal distribution. Since the resulting ρ-values are above the significance level (time: ρ(GA) = 0.69 and ρ(GB) = 0.61; deployments: ρ(GA) = 0.55 and ρ(GB) = 0.16), we also fail to reject the null hypothesis (i.e., both groups present a normal distribution in the results). Ergo, we assume that both samples come from normally distributed populations with equal variances.

We then use a Student's t-test for assessing the following hypotheses related to time, viz. H0: both groups needed a similar amount of time to complete the task, and H1: there exists a significant difference in the average time for each group to complete the task. Concerning deployments, we assume H0:
both groups made a similar amount of deployments to complete singled out one in ET2, forcing us to discard it (cf. Fig. 8b),
the task, and H1 : there exists a difference in the average of resulting in a ρ? -value of 0.03. This allows us to conclude
deployments made to each group to complete the task. that the experimental group does present a statistical difference
We observe that the time spent has a ρ-value=0.997 and the when adding new features to an existing system concerning
number of deployments has a ρ-value=0.866, failing to reject time. Regarding the other tasks, we believe that they might
H0 , and thus be forced to consider that there is no statistical have not captured a sufficient degree of difficulty/complexity
difference between the two groups, as intended (cf. Fig. 7). to evidence substantial differences and/or the sample size was
insufficient. We do consistently observe a lower mean and
median for all tasks in the experimental group.

TABLE III: Number of deployments in ET1–3.

Task Grp Mean σ Med t-test (ρ)

A 7.90 3.60 7.50


ET1 < 0.01
B 3.00 1.05 3.00

A 4.30 2.11 4.50


ET2 0.01
B 2.10 1.29 2.00

A 4.50 2.07 4.00


ET3 0.04
B 2.70 1.49 2.50

(a) (b)
Fig. 7: Time (a) and number of deployments (b) in CT2.

2) Experimental Tasks: Using the same hypotheses de-


scribed in the Control Tasks, we present the results of the
Experimental Tasks, together with a qualitative analysis.

TABLE II: Time spent in ETs

Task Grp Mean σ Med t-test (ρ) (a) ET1 (b) ET2 (c) ET3
A 12:53 5:34 12:17 Fig. 9: Number of deployments in ET1–3.
ET1 0.75
B 12:08 4:33 11:36

ET2
A 8:13 2:10 8:34
0.30 (0.03? )
Deployments: All experimental tasks present ρ-values lower
B 6:57 3:05 5:47 than the significance level (0.05). This allows us to reject the
A 8:34 2:32 8:12 null hypothesis and accept there is a significant difference in
ET3 0.47
B 7:49 1:59 8:05 the average number of deployments made between the groups,
with the experimental performing fewer attempts.
Comparing the mean and median of the number of deploy-
ments to reach the solution (cf. Table III), there is a clear
tendency for the experimental group to need fewer deployments
— nearly half compared to the control group. This aligns with
our initial hypothesis since every time the user needs to add
new debug nodes in the control group, they are forced to deploy.
On the other hand, the experimental group was presented with
real-time feedback, thus decreasing such need.
Verification Requests: A verification request occurred every
time a participant regarded their task as completed. The
(a) ET1 (b) ET2 (c) ET3
statistical analysis allow us to reject the null hypothesis on both
Fig. 8: Time spent in ET1–3. ET2 and ET3 (cf. Table IV). Regarding the construction and
evolution tasks, we conclude that there is a significant difference
Time: Analyzing the time spent for the three tasks and between groups concerning their subjective perception of task
the results from the t-test (cf. Table II), we were initially completion, as the experimental group required fewer attempts.
unable to reject the null hypothesis for all tasks. We started Behavior: We observed that the experimental group, espe-
by concluding there are no relevant differences between the cially during ET1, changed their debugging strategy by focusing
two groups (cf. Fig. 8). However, a Grubb’s test for outliers on visualizing and understanding the messages in the system

97
2020 IEEE/ACM 24 ͭ ͪ International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

TABLE IV: Number of verification requests in ET1–3. NO DE R E D - CAU L D RO N NO DE - R E D

Q5 3 4 3 Q5 2 8

Task Grp Mean σ Med t-test (ρ) Q4 1 4 5 Q4 2 8

Q3 5 2 3 Q3 1 2 2 5

A 1.50 0.53 1.50 Q2 1 2 4 3 Q2 1 2 5 2


ET1 0.33 Q1 3 7 Q1 1 4 5
B 1.80 0.79 2.00
S1 2 8 S1 1 1 8

A 1.50 0.53 1.50 Strongly Disagree Disagree Neither Agree Strongly Agree
ET2 0.05
B 1.10 0.32 1.00 Fig. 11: Results of the survey post-test.
A 1.80 0.92 1.50
ET3 0.04
B 1.10 0.32 1.00
instead of attempting to understand the underlying logic of each node. This was one of the most interesting observed phenomena because it represents a change in the participants’ behavior when approaching their tasks. This finding merits further study before any major conclusions can be drawn.

3) Experimental Group Feature Usage Analysis: After aggregating the results for each task (cf. Fig. 10), we conclude that the most used features in NODERED-CAULDRON were those related to the visualization of the messages, i.e., (1) plot, (2) detailed message, and (3) history. In terms of usage by task (cf. Table V), we observe a higher mean and median for ET1, followed by ET3 and then ET2. These results were expected, since in ET1 participants spent more time understanding the system, and consequently the messages that flow through it. In ET2, the extra features were not used as much because the participants already understood the system and did not feel the need for a deeper exploration. ET3 was focused on constructing a new system, which results in the observed higher values as they attempted to understand the messages’ flow.

Fig. 10: Clicks on NODERED-CAULDRON functionalities.

TABLE V: Total clicks aggregated by ET1–3.

Task  Mean   σ      Med    Min.  Max.
ET1   54.60  34.36  43.00  21    130
ET2   17.50  12.64  12.50  1     35
ET3   23.10  15.38  21.00  4     61

4) Post-test Survey: To evaluate the participants’ experience, we performed a post-test survey composed of six questions, one about the general satisfaction (S1), and five concerning each one of the functionalities (Q1–Q5), namely: (Q1) is related to showing the input’s messages on each node, (Q2) to the plot that shows the messages, (Q3) to the breakpoint system, (Q4) to having typed connections between nodes, and (Q5) to the highlight of the node path of a message. Although only the experimental group (GB) used some of the new features, we also asked the control group (GA) if they would like to have had such features. Interestingly, we found a very close match between the two groups (cf. Fig. 11). The highest divergence was found in Q4 and Q5, which referred to features unavailable to both groups (i.e., these were not implemented in NODERED-CAULDRON). This can be explained by the experimental group having been exposed to real-time feedback during development, and therefore not feeling the need for these extra features. In Q3, the results were similar, since in our tool participants ended up not using breakpoints. We conclude that most participants seem to want the functionalities described in each question. Finally, the results of S1 suggest that the experimental group had a more enjoyable experiment.

D. Discussion

Taking into account the experimental results presented in Section IV-C, we now revisit our research questions:

RQ1. Would users with increased exposure to real-time information about the running system build and manage it faster? Both groups spent a similar amount of time solving the tasks, with a statistically significant difference observed on improving systems. We also note that the experimental group presented consistently smaller mean and median values;

RQ2. Does providing users with real-time feedback increase their ability to understand and change existing systems? According to the number of deployments performed per task together with the qualitative analysis, we can conclude that in a system with higher feedback, users tend to perform fewer deployment attempts, which suggests that these features make the system easier to change;

RQ3. Is an IoT visual programming environment able to reduce human-induced errors during development by providing real-time feedback? By analyzing the number of deployments and attempts, we see a substantial difference: users in the experimental group have less need to deploy and more confidence in their solution (i.e., they required fewer attempts to achieve a successful task completion). This can be especially useful in more sensitive systems, where deployments should be kept to a minimum.

In summary, there is significant evidence that an environment with real-time feedback and improved debug capabilities impacts the ability to build, maintain and improve IoT systems.


V. CONCLUSIONS

IoT systems and their application across domains with different constraints and responsibilities have boosted their heterogeneity and complexity at a mostly unprecedented scale. The tremendous shortage of qualified personnel to design, develop and maintain these systems has pushed both industry and academia to create new ways to develop IoT systems that abstract the system’s complexity to different degrees. One of those approaches, already used for programming programmable logic controllers (PLCs), was visual programming. Amongst those, Node-RED appeared as one of the most common solutions to develop IoT systems. Node-RED, despite its popularity, has several drawbacks, such as: (1) lack of observability in both flows and internal node logic, (2) no type checking on the connections between nodes, (3) no way to properly test the system (no way to change messages during runtime), (4) no feedback about the running systems (no way to inspect messages and their payloads), and (5) poor or nonexistent debug mechanisms. In this work we attempted to overcome some of these drawbacks by enhancing the existing Node-RED with new features. To assess how such features would impact the development of IoT systems, a proof-of-concept was developed and an empirical evaluation followed with 20 participants². We conclude that the added enhancements improve the overall development process, with a significant reduction of the number of failed attempts to deploy the systems without fulfilling their requirements. Further, the overall system development time was lower than with the normal Node-RED. As future work we plan to address some of the remaining challenges, namely typed connections (adding type-safety to Node-RED) and further improving the ways of testing and asserting the correctness of the system.

² A replication package containing both the source code and experimental materials is available at http://doi.org/10.5281/zenodo.3981547

ACKNOWLEDGMENTS

This work was partially funded by the Integrated Masters in Informatics and Computing Engineering of the Faculty of Engineering, University of Porto (FEUP) and the Portuguese Foundation for Science and Technology (FCT), ref. SFRH/BD/144612/2019.

REFERENCES

[1] D. Miorandi, S. Sicari, F. De Pellegrini, and I. Chlamtac, “Internet of things: Vision, applications and research challenges,” Ad Hoc Networks, vol. 10, no. 7, pp. 1497–1516, 2012.
[2] G. Gardašević, M. Veletić, N. Maletić, D. Vasiljević, I. Radusinović, S. Tomović, and M. Radonjić, “The iot architectural framework, design issues and application domains,” Wireless personal communications, vol. 92, no. 1, pp. 127–148, 2017.
[3] A. Taivalsaari and T. Mikkonen, “A Roadmap to the Programmable World: Software Challenges in the IoT Era,” IEEE Software, vol. 34, no. 1, pp. 72–80, 2017.
[4] Microsoft, “Iot signals – summary of research learnings,” Microsoft, Tech. Rep., 2019.
[5] A. S. Lago, J. P. Dias, and H. S. Ferreira, “Conversational interface for managing non-trivial internet-of-things systems,” in Computational Science – ICCS 2020. Cham: Springer, 2020, pp. 384–397.
[6] F. Corno, L. De Russis, and A. M. Roffarello, “A high-level approach towards end user development in the iot,” in Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, ser. CHI EA ’17. New York, NY, USA: ACM, 2017, pp. 1546–1552.
[7] L. Kumar, L. Jetley, and A. Sureka, “Source Code Metrics for Programmable Logic Controller (PLC) Ladder Diagram (LD) Visual Programming Language,” International Workshop on Emerging Trends in Software Metrics (WETSOM), no. Ld, 2016.
[8] A. Seitz, F. Thiele, and B. Bruegge, “Fogxy: An architectural pattern for fog computing,” Proceedings of the 23rd European Conference on Pattern Languages of Programs, 2018.
[9] J. P. Dias, B. Lima, J. P. Faria, A. Restivo, and H. S. Ferreira, “Visual self-healing modelling for reliable internet-of-things systems,” in Computational Science – ICCS 2020. Cham: Springer, 2020, pp. 357–370.
[10] Node-red, “Flow-based programming for the Internet of Things,” 2017, accessed November 2019. [Online]. Available: https://nodered.org/
[11] A. Aguiar, A. Restivo, F. F. Correia, H. S. Ferreira, and J. P. Dias, “Live software development: Tightening the feedback loops,” in Proceedings of the Conference Companion of the 3rd International Conference on Art, Science, and Engineering of Programming, ser. Programming ’19. USA: ACM, 2019.
[12] G. J. Holzmann, “The logic of bugs,” ACM SIGSOFT Symposium on the Foundations of Software Engineering, pp. 81–87, 2002.
[13] M. Bajer, “Building an iot data hub with elasticsearch, logstash and kibana,” in 2017 5th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW), Aug 2017, pp. 63–68.
[14] S. Dharur and K. Swaminathan, “Efficient surveillance and monitoring using the elk stack for iot powered smart buildings,” in 2018 2nd International Conference on Inventive Systems and Control (ICISC), Jan 2018, pp. 700–705.
[15] M. Blackstock and R. Lea, “Toward a distributed data flow platform for the Web of Things (Distributed Node-RED),” ACM International Conference Proceeding Series, vol. 08-October, pp. 34–39, 2014.
[16] N. K. Giang, M. Blackstock, R. Lea, and V. C. Leung, “Developing IoT applications in the Fog: A Distributed Dataflow approach,” in 5th International Conference on the Internet of Things, 2015, pp. 155–162.
[17] M. Blackstock and R. Lea, “FRED: A hosted data flow platform for the IoT,” in Proceedings of the 1st International Workshop on Mashups of Things and APIs, MOTA 2016, 2016.
[18] G. Zodik, N. Il, S. J. Todd, and W. Gb, “Monitoring execution of an hierarchical visual program such as for debugging a message flow,” 2004.
[19] Node-RED, “User guide: Node-red,” https://nodered.org/docs/user-guide/, [Online; January 2020].
[20] D. Ancona, L. Franceschini, G. Delzanno, M. Leotta, M. Ribaudo, and F. Ricca, “Towards runtime monitoring of node.js and its application to the internet of things,” in Electronic Proceedings in Theoretical Computer Science, EPTCS, vol. 264, 2018, pp. 27–42.
[21] Blender, “Blender manual,” https://docs.blender.org/, [Online; January 2020].
[22] U. Engine, “Unreal engine – the most powerful real-time 3d creation platform,” https://www.unrealengine.com/, [Online; January 2020].
[23] Epic Games, “Blueprint debugging example,” https://docs.unrealengine.com/en-US/Engine/Blueprints, [Online; January 2020].
[24] P. P. Ray, “A Survey on Visual Programming Languages in Internet of Things,” Scientific Programming, vol. 2017, 2017.
[25] J. P. Dias, J. P. Faria, and H. S. Ferreira, “A reactive and model-based approach for developing internet-of-things systems,” in 2018 11th International Conference on the Quality of Information and Communications Technology (QUATIC), 2018, pp. 276–281.
[26] J. P. Dias, H. S. Ferreira, and T. B. Sousa, “Testing and deployment patterns for the internet-of-things,” in Proceedings of the 24th European Conference on Pattern Languages of Programs, ser. EuroPLop ’19. New York, NY, USA: ACM, 2019.
[27] A. Santos, M. Oivo, and N. Juristo, “Moving beyond the mean: Analyzing variance in software engineering experiments,” in Product-Focused Software Process Improvement, M. Kuhrmann, K. Schneider, D. Pfahl, S. Amasaki, M. Ciolkowski, R. Hebig, P. Tell, J. Klünder, and S. Küpper, Eds. Springer, 2018, pp. 167–181.
[28] B. Kitchenham, L. Madeyski, and P. Brereton, “Problems with statistical practice in human-centric software engineering experiments,” in Proceedings of the Evaluation and Assessment on Software Engineering, ser. EASE ’19. New York, NY, USA: ACM, 2019, pp. 134–143.


MEDART-MAS: MEta-model of Data Assimilation on Real-Time Multi-Agent Simulation

Bassirou Ngom¹,², Moussa Diallo¹ and Nicolas Marilleau²
¹ Université Cheikh Anta Diop, Dakar, Senegal
² UMI UMMISCO-IRD/Sorbonne Université, Paris, France

Abstract—In the modeling and simulation process, data plays an important role. Data is required to validate the model and to experiment with scenarios. It is also necessary for fitting and calibrating model parameters. In the case of online simulation, data assimilation approaches make it possible to inject data into simulations and to recalibrate simulations based on real-time data. This paper addresses the challenge of assimilating data into an agent-based simulation by promoting a novel architecture dedicated to data assimilation. A few improvements have been made to adapt Multi-Agent Simulations to real-time data assimilation. The architecture is designed to be generic enough to allow a wide diversity of case studies. We propose a meta-model of data assimilation and implement a toolkit based on the GAMA simulator. Finally, we use temperature data to test the implementation of a simple use case.

Keywords—Data Assimilation, Real-time System, agent-based simulation, Dynamic Data-Driven Simulation, GAMA simulator

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Simulation has been used to study, understand and predict the behavior of complex systems. Data play an important role in the modeling and simulation processes. Data is used in model design and validation, and to test “what if” scenarios. The simulation also uses the data to calibrate the model parameters in order to reduce the gap between simulation results and actual observations [1]. Furthermore, combining system observations (the data) with a running simulation can increase the accuracy of these models. This combination can also improve the results of the simulation and make it possible to react quickly to unexpected phenomena (in case of wildfire [2] or road traffic regulation [4]). For that, data assimilation seems to be an interesting way to inject data into the simulation and to manage the evolution of the system’s state with real data.

Data assimilation (DA) is a collection of methods that seek to combine uncertain models with uncertain data to provide the best estimate of the system state at the points in time at which observations are available [6]. The challenge of DA in real-time simulation has long been of interest to the scientific community. Therefore, many DA methods have been proposed and applied to reduce uncertainty. They have been associated with the application of real-time models, mainly in the related scientific fields (meteorology, hydrology, climatology, etc.) [7].

In the context of real-time systems, the multi-agent approach is the preferred one for modeling complex systems and for understanding emergent phenomena [9]. Indeed, with this approach, several agents are able to manage the time constraints of the system and participate in its dynamics at the same time [4] [5]. However, depending on the field, several assimilation methods have been developed in the literature. These methods can be split into three groups: sequential methods, variational methods and machine learning methods. The work that we present in this paper does not focus on a particular assimilation method. We propose a meta-model to ease data assimilation in real-time multi-agent simulation. The architecture of the proposed meta-model is designed to be generic enough to address various simulation domains. The main goal is to help modelers, whatever their case study and simulator (GAMA, NetLogo, ...), to implement the appropriate assimilation methods. In this paper we present:
• The architecture of the proposed Meta-model of Data Assimilation on real-time Multi-Agent Simulation, named MEDART-MAS.
• The implementation of the proposed MEDART-MAS using the agent-oriented GAMA (GIS Agent-based Modeling Architecture) simulator.
• A simple use case to validate the implementation of MEDART-MAS on the simulator.

The rest of the paper is organised as follows. Section II gives an overview of work on assimilation and real-time simulation. Section III describes the data assimilation architecture and our proposal, and presents the meta-model of data assimilation for real-time multi-agent simulation. Section IV introduces the implementation of the meta-model on the GAMA simulator. A simple use case is described in Section V. Finally, Section VI concludes our work.

II. OVERVIEW OF REAL-TIME SIMULATION AND DATA ASSIMILATION MODELS

A. Real-Time Systems and Multi-Agent Based Modeling

The literature shows that Multi-Agent Systems (MAS) are used to study various real systems [8]. For real-time systems, several contributions introduce a new type of agent, often called Real-Time Agent (RTAgent), which is more intelligent and autonomous than other agents in the system. They require real-time responses and must eliminate



the possibility of massive communication among agents [9]. These smart agents are responsible for DA in the simulation model. This approach has been widely used in the management of road traffic systems. TraSMAPI (Traffic Simulation Manager Application Programming Interface) is proposed in [10]. It is an interface through which multiple simulators can connect to use road traffic data. The authors use multi-agent based modeling and a stochastic module to integrate data into the simulation. In [4], the authors present an architecture for the prediction and control of road traffic using online simulation, based on data collected in real time. The way data is injected into the simulation is not specified in the paper, but the authors define an integration interface that allows the controller to collect new data. Probabilistic approaches are then used to control traffic lights, make a forecast on the road or visualize traffic. A similar method is used in [5].

J. Soler et al. designed in [1] a simulation architecture using an agent-based approach for a real-time system. The role of the proposed SIMBA (SIstema Multiagente Basado en Artis, Artis-based Multi-agent System) is to introduce new types of autonomous agents. It consists of a multi-agent platform in which real-time agents perform real-time tasks and offer services with time constraints. The SIMBA approach makes it possible to apply the multi-agent paradigm to a real-time distributed problem for which the multi-agent approach seems to be the most appropriate as a centralized approach. In [9], Julian et al. proposed a real-time multi-agent system based on SIMBA.

This modeling approach allows us to understand the dynamics of complex systems in real time. It has been used extensively in the community. The advantage of this approach is the decoupling between temporal constraints and the dynamics of the system. To manage the data injected into the simulation, several concepts have been developed, such as Dynamic Data-Driven Simulation or Application Systems.

B. DDDS: Dynamic Data-Driven Simulation

Nowadays, we are witnessing the emergence of sensor technologies, which allow us to monitor systems using sensor networks and obtain real-time data. This has allowed the introduction of new concepts such as Dynamic Data-Driven Simulation (DDDS) [3], where the simulation is influenced by the system’s real data. The Dynamic Data Driven Application Systems (DDDAS) [11] concept entails the ability to inject data into a running simulation or application and, conversely, the ability of an application to dynamically manage the measurement process (retro-action). Dynamic Data Driven Multi-Agent Simulation (DDDMAS), proposed in [12], is the link between DDDAS and agent-based simulation. In this kind of application, data can be provided in real time (online) or be archival data (standalone). DDDAS improve modeling methods and increase the analysis and prediction capabilities of application simulations [1]. Suzuki and Osogami proposed in [13] a real-time DA method using Sequential Monte Carlo. This approach consists of running multiple parallel simulations with random initial values and different parameter values. Each simulation is assigned a weight that is recalculated using the model correction and real data of the system. This method provides much more accurate model parameters but is very computationally intensive.

All of these concepts use DA methods to couple models and observation data.

C. Data Assimilation Methods

Data assimilation can be described using two approaches: the sequential approach and the variational approach [14]. The sequential approach assumes that all observations come from the past relative to the analysis (a priori data). It relies on statistical studies of the system’s state to statistically determine the state that best suits the observations [7]. Furthermore, these methods make it possible to perform an analysis at each time step where data is available, to estimate the actual system state. The variational approach assumes that future observations relative to the analysis are also usable [14]. Among the most used assimilation methods, we have:
• Kalman filter
• Ensemble Kalman filter
• Extended Kalman filter
• Particle filter or Sequential Monte Carlo
• Variational methods (3D-VAR and 4D-VAR)

All of these assimilation methods contain two parts:
• Prediction: predicts the system state at a given time t,
• Correction: based on the past prediction of the system state and the new observation data of the system, updates the prediction and corrects the prediction error.

From the description of the assimilation methods above, we note that the assimilation model is closely related to the simulation model, either to predict the system state or to correct prediction errors. However, for multi-agent systems, the dynamics of the system are represented by a set of interacting agents. With this type of system, DA using the assimilation methods described above becomes more complicated.

Wang M. and Hu X. propose in [15] the assimilation of sensor data for multi-agent simulation of smart environments in real time. They use a particle filter as the assimilation method. Using particle filters, they estimate the state of the system; the simulation is then dynamically restarted, taking the newly estimated values as initial conditions. The same approach is presented in [16], where the authors optimized the sampling algorithm of the particle filter.

III. ARCHITECTURE OF DATA ASSIMILATION ON MULTI-AGENT SIMULATION

In this section we present the position of the data assimilation model between the simulation and reality (Fig. 1). The architecture in Fig. 1 can be subdivided into three parts: the real world, the virtual world and the assimilation model.
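To make the prediction/correction split of Section II-C concrete, the following is a minimal sketch of a one-dimensional Kalman filter, the first method in the list there. It is an illustration only and not part of MEDART-MAS: the random-walk state model, the class name `ScalarKalman`, the helper `assimilate` and the noise variances `q` and `r` are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model (assumed)."""
    x: float  # current state estimate
    p: float  # variance of the estimate
    q: float  # process-noise variance (illustrative assumption)
    r: float  # observation-noise variance (illustrative assumption)

    def predict(self) -> float:
        # Prediction: a random walk keeps the state; only the uncertainty grows.
        self.p += self.q
        return self.x

    def correct(self, z: float) -> float:
        # Correction: blend prediction and observation z using the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


def assimilate(observations, x0=0.0, p0=1.0, q=0.01, r=0.25):
    """Run one predict/correct cycle per observation and return the estimates."""
    kf = ScalarKalman(x=x0, p=p0, q=q, r=r)
    estimates = []
    for z in observations:
        kf.predict()
        estimates.append(kf.correct(z))
    return estimates
```

For instance, `assimilate([25.1, 24.8, 25.3], x0=20.0)` starts from a poor initial guess and moves the estimate toward the observed values at each step. The variance `p` plays the role that an error covariance matrix plays in the multivariate case.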


Fig. 1: Architecture of assimilation model on Multi-Agent Simulation

A. Real World

The real world (real system), represented by the data acquisition system, consists of a set of interacting elements. Sensors are deployed in the system and allow real-time monitoring and collection of system phenomena. Nowadays, with the development of wireless sensor network technology, several methods are used in this field. The application of these methods especially improves the optimal placement of sensor nodes, data transmission, and self-organization. Through the interconnection of sensors and the Internet, all data from the sensors are stored on an online server (cloud) [17]. By integrating advanced technologies (big data, machine learning, etc.), real-time data visualization and prediction can be achieved [18]. In this architecture, real-world data comes from sensors. The data is then stored on an online server so that it can be injected into the simulation model in real time through the assimilation model.

B. Virtual World

The virtual world is a real-time agent-based model. The model represents an abstraction of the real world. The latter’s dynamics are studied through autonomous entities called Agents. According to Ferber [19], an Agent is “real or virtual autonomous entity, capable of acting on his own environment, to perceive it. An Agent can communicate with another agent but has knowledge and offers services”.

As indicated in the definition, agents can act on their environment and interact with each other. This approach is used to model complex systems. In the context of real-time systems, the agent-based approach allows decoupling of the system dynamics and the time constraints. Thus, several works have been devoted to the introduction of a new type of agent that can participate in the system dynamics and manage critical tasks [1] [4]. We use this same approach in this proposal. In the virtual world, beyond the agents of the system, we introduce agents called Real-Time Agents (RTAgents). An RTAgent is in charge of data assimilation in the running simulation.

C. Assimilation Model

The assimilation model is the model through which data from the real world transits to be injected into the virtual world (the simulation). With the agent-based approach used in the virtual world, RTAgents can have an assimilation model; that is, the assimilation model is a behavior of the RTAgent. The RTAgent can interact with the environment and other agents in the model. It can also connect to servers to obtain data from sensors in real time via the assimilation model. In this paper we propose an assimilation meta-model (Fig. 2) to allow the injection of data into a running multi-agent simulation.

In this meta-model, the data is generated by the data acquisition system. This paper does not specify how data is collected; however, asynchronous communication is to be established between the data acquisition system and the assimilation model. The latter is the intermediate model through which data from the real world is transferred to the agent-based simulation model (virtual world).

The assimilation model is mainly composed of three parts:
• Data Adaptor: It is an interface connected to the data acquisition system. It takes into account data from sensors, databases or other data sources. This interface defines how to connect data from other data sources to the DA model. Asynchronous communication is favored so that the assimilation can obtain data from the source every $T_{data}$. After collection, data is formatted and then sent to the estimation model.
• Estimation Model: This component is a predictive model used to estimate a series of data at time $t_{simul}$ when observation data is not available. It can be composed of several prediction or estimation models, depending on the correction module. In addition, this component allows synchronization between the simulation ($t_{simul}$) and the data acquisition model ($T_{data}$). In order to manage the real-time aspect, when the observation data is not available, it predicts all data D' required for the simulation at time $t_{simul}$ based on the previous observation data. This makes it possible to react to the temporality of the simulation and to regularize it by injecting the appropriate data from the Data Adaptor at the time of the request. This module has three input parameters: the simulation time $t_{simul}$, the new data D, and the parameter E, received from the Correction module at time $T_{data}$, which improves the estimation model. The output of this module is the estimated data D' at time $t_{simul}$.
• Correction: The Correction module corrects and improves the estimation model by reducing the gap between the predicted data and the actual data. This module will


run every $T_{data}$. It takes into account the new data D from the Data Adaptor and the previous estimated data at time $t_{simul}$ in order to update the parameters E of the estimation model (e.g., E may be an error covariance matrix). This element E is then taken into account in the estimation model to enhance the prediction.

Fig. 2: MEDART-MAS: MEta-model of Data Assimilation on Real-Time Multi-Agent Simulation

IV. IMPLEMENTATION OF MEDART-MAS ON GAMA

This section describes the implementation of the meta-model on GAMA as a plug-in.

A. GAMA platform

GAMA (GIS Agent-based Modeling Architecture) is an open-source simulation platform developed since 2007. GAMA is a full Integrated Development Environment (IDE) based on the Eclipse IDE, and allows quick switching between the modeling and simulation perspectives. GAMA offers a rich interface and a simple modeling language called GAML (GAMA Modeling Language). GAML is an agent-oriented language, which means that everything “active” (model entities, simulations, ...) can be represented as an agent in GAML; [20] describes the concepts behind GAMA. An agent type, called a species in GAML, provides a set of:
• attributes (what they know),
• actions (what they can do),
• reflex (what they can actually do).

In GAML, the concept of skills can be used to construct species in a combined manner. Skills are bundles of attributes and actions that can be shared between different species and inherited by their children. In order to implement our proposal, we created an Assimilation skill in GAMA.

B. Implementing Meta-model

MEDART-MAS is composed of essentially two parts: the agent and the assimilation model (Fig. 3).

1) Agent: The agent in the multi-agent model has the stereotype <<agent>>. The RTAgent inherits from the agent of the agent-based model, and may have a critical task to perform for the model or have an assimilation model.

2) Assimilation Model: It is an agent behavior with three elements: estimation model, correction and data adaptor. It uses the estimation model to obtain the estimated data and injects it into the simulation via the RTAgent.
• Estimation Model: as described above, based on the previous data and the previous prediction error, the estimation model can predict the system state at a given time interval. The estimated value is used by the assimilation model and injected into the simulation through the RTAgent.
• Correction: a class used to save the previous estimations of the estimation model. Based on new data from the data adaptor, the corrector class updates the prediction parameters.


Fig. 3: UML representation of MEDART-MAS

• Data Adaptor: A class that defines and creates connectors for different data sources. These data sources may differ in nature (databases, sensors, etc.). The data adaptor class defines how to receive, collect and integrate data.

In the implementation, we create two skills based on the meta-model defined in Fig. 3.

3) Network Skill: This skill creates a communication link between an agent and an IP-based network, and provides several ways to connect an agent to a network. It is designed to be independent of the assimilation skill. It uses the data part (data adaptor and data source) of our proposal. We designed an abstraction layer that enables GAMA to connect to a server through the network using several protocols (e.g. TCP, UDP, MQTT).

4) Assimilation skill: There are two important actions for the assimilation skill:
• Correction: This action is called every $T_{data}$. We allow modelers to design their own correction action, so we do not provide an implementation of this function, but it is executed once per $T_{data}$ in the background.
• Estimation: This action is called every $t_{simul}$.

In the assimilation skill, the Correction action and the Estimation action must be implemented in GAML by the modeler. In this way, a modeler can choose and implement different estimation and correction models, or combine several models for assimilation in a multi-agent system.

V. TEST WITH SIMPLE USE CASE: AMBIENT TEMPERATURE

The aim of this section is to test the model and to show that it can be fed with real-time data through a simple assimilation model while giving consistent results. We therefore looked for a type of model that is easy to implement for test purposes. We chose temperature data because of the simplicity of the models used for its prediction and of their implementation in our simulation platform [21]. In addition, this type of data is easy to access because we have deployed a real-time temperature capture station. Every 15 minutes, data is collected from temperature sensors deployed in the Dakar (Senegal) region.

A. Assimilation model

To inject the temperature data into the simulation through our assimilation scheme, an appropriate assimilation model must be defined. According to the assimilation scheme introduced above (Fig. 2), we can choose an algorithm for each block.

• Data Adaptor: For the acquisition part, to simulate real-time communication, we wrote a Python script that sends the data to an MQTT broker; the data adaptor connects to the broker to get the data. The data adaptor formats the data by separating arrival time from temperature, and injects it into the simulation through the estimation model.
• Estimation and correction model: For the estimation model, we use a linear regression model. We define D as the data from the real sensor (the output of the data adaptor) and D' as the estimated data (the output of the estimation model). With the linear estimation model, the output of the model is:

$$D' = aD + b + \epsilon \qquad (1)$$

where a and b are the model parameters and $\epsilon$ is the estimation error. Since D and D' have different temporalities ($T_{data}$ and $t_{simul}$), there is a temporality problem in this equation. To solve it, we introduce the variable $\lambda$, which represents the assimilation frequency and is defined by:

$$T_{data} = \lambda \, t_{simul} \qquad (2)$$

• Simulation model: The simulation model is represented by an agent that plays the role of a virtual sensor in the GAMA simulator. This agent is connected to a real sensor (temperature sensor) to retrieve and display data through the assimilation model.
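The estimation and correction steps described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the GAML skill itself: the class and method names are hypothetical, time is assumed to be the regressor (the text does not say which variable is used), and the window of five readings is the one mentioned in Section V-B. The corrector refits a and b whenever a sensor reading arrives (every $T_{data}$), and the estimator evaluates the current line at each simulation step (every $t_{simul}$).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: windowed least-squares corrector plus linear estimator. */
class WindowedLinearEstimator {
    private static final int WINDOW = 5;   // window size mentioned in Section V-B
    private final Deque<double[]> window = new ArrayDeque<>(); // {time, value} pairs
    private double a = 0.0;                // slope of the fitted line
    private double b = 0.0;                // intercept of the fitted line

    /** Correction: called every T_data with a fresh sensor reading. */
    void correct(double time, double value) {
        if (window.size() == WINDOW) window.removeFirst();
        window.addLast(new double[] {time, value});
        refit();
    }

    /** Estimation: called every t_simul; evaluates the current line at the given time. */
    double estimate(double time) {
        return a * time + b;
    }

    /** Ordinary least squares over the readings currently in the window. */
    private void refit() {
        int n = window.size();
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (double[] p : window) {
            sx += p[0]; sy += p[1]; sxx += p[0] * p[0]; sxy += p[0] * p[1];
        }
        double denom = n * sxx - sx * sx;
        if (n < 2 || denom == 0.0) {       // no slope can be fitted yet: hold the mean
            a = 0.0;
            b = sy / n;
        } else {
            a = (n * sxy - sx * sy) / denom;
            b = (sy - a * sx) / n;
        }
    }

    public static void main(String[] args) {
        WindowedLinearEstimator est = new WindowedLinearEstimator();
        double tData = 10.0;               // seconds between sensor readings
        int lambda = 2;                    // T_data = lambda * t_simul
        double tSimul = tData / lambda;    // 5 seconds, as in equation (2)
        double[] readings = {20.0, 20.5, 21.0, 21.5}; // made-up temperatures
        for (int k = 0; k < readings.length; k++) {
            est.correct(k * tData, readings[k]);
            for (int j = 1; j <= lambda; j++) {
                double t = k * tData + j * tSimul;
                System.out.printf("t=%.1f estimate=%.3f%n", t, est.estimate(t));
            }
        }
    }
}
```

With a larger λ the inner loop runs more estimation steps between two corrections, so the fitted line is extrapolated further before it is refreshed.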


Fig. 4: Model fitting with temperature data

B. Model fitting

Fig. 4 compares the data produced by the assimilation model with the actual measurement data set. Actual data is shown in blue and estimated data in red. The estimated data are obtained with a fixed λ = 2. The goal of this simulation is to evaluate whether the chosen estimation model can adapt to the data and make predictions with small errors. Setting λ = 2 in equation 2, if $T_{data}$ = 10 seconds then $t_{simul}$ is equal to 5 seconds. After simulation, an average error of 0.17 degrees is obtained.

Looking at the estimation curve, we can see that an error occurs at the peaks. In our case, the parameters of the linear regression (a and b) are evaluated by the corrector at each time step $T_{data}$. For this, the corrector stores the actual data in a window of five values and rebuilds the linear regression model (the values of parameters a and b). Therefore, when the data show a sharp peak, the curve is difficult to fit, but it adapts after several iterations (continuous correction).

C. Error evaluation

We conducted several simulations with different values of the assimilation frequency (λ) to understand how prediction errors evolve. For our tests, we set $T_{data}$ = 1 minute and then changed the simulation step. After each simulation, we calculated the average prediction error. The results are shown in Fig. 5.

Fig. 5: Error representation for λ variation with temperature

Fig. 5 shows the mean prediction error for different values of λ. We can see that the prediction error increases as the assimilation frequency λ increases. Indeed, we use a linear regression model, which means that all predicted values lie on a straight line. If the incoming data fit the line defined by the regression parameters, the error is not too large, but at the peaks the error becomes larger and larger. According to equation 2, if λ = 10 for example, we predict 10 temperature values before the correction. This means that the larger the value of λ, the larger the prediction error. The simulation series was stopped at λ = 50, where the average error is equal to 0.62 °C.

VI. CONCLUSION AND FUTURE WORK

In this paper, we worked on data assimilation in real-time agent-based simulation. We proposed a meta-model that aims to support different data assimilation methods. This can help modelers choose their own estimation and correction models. We think that this meta-model brings more flexibility and will help to adapt to the different phenomena studied. We implemented the meta-model as a toolkit on the GAMA simulator by creating two new skills: a network skill and an assimilation skill. We then tested it using ambient temperature data. The choice of ambient temperature relies on the fact that assimilation models for temperature are easier to implement. The aim of the implementation was to test the meta-model in real conditions, so using simple models was a good choice for us. We conducted several simulations on temperature data with several values of λ.

In the future, we plan to extend our work to geographical estimation. Our work currently estimates data at a single point, at the times when the simulation needs data. Our next objective is to work on cases where sensors are geographically distributed. The challenge will be to estimate data in places where there is no sensor. After that, we will use some practical cases to test the realization of the meta-model, such as real-time air monitoring or real-time traffic control.

REFERENCES

[1] Soler J, Julian V, Rebollo M, Carrascosa C, Botti V. “Towards a real-time multi-agent system architecture”. COAS, AAMAS. 2002;2002.


[2] Mandel J, Bennethum LS, Chen M, Coen JL, Douglas CC, Franca LP, Johns CJ, Kim M, Knyazev AV, Kremens R, Kulkarni V. “Towards a dynamic data driven application system for wildfire simulation”. In International Conference on Computational Science 2005 May 22 (pp. 632-639). Springer, Berlin, Heidelberg.

[3] Hu X. “Dynamic Data-Driven Simulation: Connecting Real-Time Data with Simulation”. In Concepts and Methodologies for Modeling and Simulation 2015 (pp. 67-84). Springer, Cham.

[4] Wahle J, Schreckenberg M. “A multi-agent system for on-line simulations based on real-world traffic data”. In System Sciences, 2001. Proceedings of the 34th Annual Hawaii International Conference on 2001 Jan 6 (pp. 9-pp). IEEE.

[5] Kosonen I. “Multi-agent fuzzy signal control based on real-time simulation”. Transportation Research Part C: Emerging Technologies. 2003 Oct 1;11(5):389-403.

[6] Higuchi T. “Embedding reality in a numerical simulation with data assimilation”. In 14th International Conference on Information Fusion 2011 Jul 5 (pp. 1-7). IEEE.

[7] Evensen G. “Data assimilation: the ensemble Kalman filter”. Springer Science & Business Media; 2009 Aug 17.

[8] Brazier, F. M., Dunin-Keplicz, B. M., Jennings, N. R., & Treur, J. (1995). “Formal specification of multi-agent systems: a real world case”. In Proceedings of the First International Conference on Multiagent Systems, June 12-14, 1995, San Francisco, California, USA.

[9] Julian, Vicente and Botti, Vicent. “Developing real-time multi-agent systems”. Integrated Computer-Aided Engineering, 2004, vol. 11, no. 2, p. 135-149.

[10] Timóteo IJ, Araújo MR, Rossetti RJ, Oliveira EC. “TraSMAPI: An API oriented towards Multi-Agent Systems real-time interaction with multiple Traffic Simulators”. In Intelligent Transportation Systems (ITSC), 2010 13th International IEEE Conference on 2010 Sep 19 (pp. 1183-1188). IEEE.

[11] Darema, Frederica and Rotea, Mario. “Dynamic data-driven applications systems”. In: Proceedings of the 2006 ACM/IEEE conference on Supercomputing. 2006. p. 2-es.

[12] Pereira GM. “Dynamic data driven multi-agent simulation”. In Proceedings of the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology 2007 Nov 2 (pp. 76-80). IEEE Computer Society.

[13] Suzuki S, Osogami T. “Real-time data assimilation”. In Simulation Conference (WSC), Proceedings of the 2011 Winter 2011 Dec 11 (pp. 625-636). IEEE.

[14] Lewis, John M., Sivaramakrishnan Lakshmivarahan, and Sudarshan Dhall. “Dynamic data assimilation: a least squares approach”. Vol. 13. Cambridge University Press, 2006.

[15] Wang M, Hu X. “Data assimilation in agent based simulation of smart environments using particle filters”. Simulation Modelling Practice and Theory. 2015 Aug 1;56:36-54.

[16] Rai S, Hu X. “Data assimilation with sensor-informed resampling for building occupancy simulation”. In Proceedings of the 2017 Winter Simulation Conference 2017 Dec 3 (p. 85). IEEE Press.

[17] Florea I, Rughinis R, Ruse L, Dragomir D. “Survey of standardized protocols for the Internet of Things”. In 21st International Conference on Control Systems and Computer Science (CSCS) 2017 May 29 (pp. 190-196). IEEE.

[18] Ngom, Bassirou, Seye, Madoune Robert, Diallo, Moussa, et al. “A Hybrid Measurement Kit for Real-time Air Quality Monitoring Across Senegal Cities”. In: 2018 1st International Conference on Smart Cities and Communities (SCCIC). IEEE, 2018. p. 1-6.

[19] Ferber, Jacques, and Gerhard Weiss. “Multi-agent systems: an introduction to distributed artificial intelligence”. Vol. 1. Reading: Addison-Wesley, 1999.

[20] Taillandier, P., Gaudou, P., Grignard, A., Huynh, Q.N., Marilleau, N., Caillou, P., Philippon, D., Drogoul, A. “Building, Composing and Experimenting Complex Spatial Models with the GAMA Platform”. GeoInformatica, Dec. 2018. https://doi.org/10.1007/s10707-018-00339-6.

[21] Zounemat-Kermani, Mohammad. “Hourly predictive Levenberg–Marquardt ANN and multi linear regression models for predicting of dew point temperature”. Meteorology and Atmospheric Physics, 2012, vol. 117, no. 3-4, p. 181-192.


Model Checking Actor-based Cyber-Physical Systems

Franco Cicirelli¹, Libero Nigro²

¹ CNR - National Research Council of Italy - Institute for High Performance Computing and Networking (ICAR) - 87036 Rende (CS) Italy
Email: f.cicirelli@icar.cnr.it
² DIMES - Department of Informatics Modelling Electronics and Systems Science, University of Calabria, 87036 Rende (CS) - Italy
Email: l.nigro@unical.it

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

Abstract—Cyber-physical systems (CPSs) integrate the continuous behavior of a physical controlled plant with the discrete behavior provided by a controlling cyber (software) part. The integration is challenging because the continuous, Newtonian time of the physical part needs to be reconciled with the discrete time of the cyber part. In this work, the event-based asynchronous actors of Theatre, extended with continuous modes, are used for modelling and analyzing CPSs. Continuous modes capture the dynamic laws (ODEs) of variation of physical/environmental variables. Theatre is control-based and distributed. It is implemented in Java, which is used both as the modelling language and as the target implementation language. Specific control forms were developed for simulating a distributed CPS and for assessing its functional/temporal behavior. Continuous modes exploit suitable ODE solvers to predict the future values of selected variables at specific time points. Although classical actors depend on non-deterministic message passing, a Theatre model can be designed to have a deterministic behavior. A hybrid Theatre model can be analyzed by exhaustive model checking, for instance by collecting the computations of the ODE solvers off-line beforehand and reusing them during verification. This paper describes Theatre, summarizes its operational semantics and illustrates a model reduction onto Uppaal timed automata. Then an automotive deterministic model based on both wired and Controller Area Network transmitted messages is presented and thoroughly analysed.

Keywords—Actors, asynchronous messages, continuous modes, cyber-physical systems, determinism, timing constraints, Controller Area Network, model checking, statistical model checking, Java.

I. INTRODUCTION

Concurrent and timed systems, including critical embedded real-time systems and cyber-physical systems (CPS) [1], must be correct from both the functional and the temporal point of view. Failing to fulfil the timing constraints can have severe consequences in practice. Assuring properties of a CPS relies on preparing a formal model [2-3] of the system and possibly using model checking [4], that is, assessing properties over all the state trajectories of the corresponding state transition system. However, designing a model for a CPS is very challenging because it must integrate multiple (sub)models respectively associated with the discrete-time cyber part and the continuous-time physical part. Moreover, CPS models tend to be analysable mostly by statistical model checking [5-7], because the integration of hybrid (ODE based) and discrete behaviours often makes the model undecidable; properties can then only be approximated by simulation experiments.

In this paper, the control-based Theatre actor-framework in Java [8-9], extended with continuous modes (with ODEs), is adopted for modelling and analysis of CPS. Theatre makes it possible to design CPS models which are closed (the external, physical environment is explicitly modelled and integrated with the models of the cyber components/actors) and have the “engineering nature” [3], that is, a model is assumed to guide a faithful system synthesis in the physical world. However, CPS models also have the “science nature” [3], because the dynamic laws of the external variables must preliminarily be predicted and captured into continuous modes.

This paper improves the authors' previous work [7,10], which was mainly based on statistical model checking [5] of CPS models. The contribution is twofold. First, Java is proposed as the modelling language for Theatre-based CPS, together with a simulation control-layer which is capable of managing, as in [11], both wired messages and deterministic messages transmitted through a Controller Area Network [12]. CPS analysis techniques are then further enhanced by a model checking approach as advocated in [4], where a Lingua Franca synchronous model is first transformed into the terms of Timed Rebeca [13] actors and then verified by the associated Afra model checking tool [14]. In this work, a reduction of a Theatre model with continuous modes onto the Timed Automata (TA) of the more general and mature Uppaal model checker [15] is proposed, whose practical effectiveness is improved by implementing mechanisms which minimize the model partial order. The model checking approach assumes that the timing model of the physical part will not request interactions with the cyber part at arbitrary (aperiodic) unknown time points, which would make model checking practically impossible [4].

The rest of this paper is structured as follows. Section II first summarizes the Theatre actor system in Java, extended


with continuous modes; then some informal arguments about the operational semantics of Theatre are given; finally, the realization of a control-layer for modelling and simulation is discussed. Section III describes a deterministic automotive CPS case study. The model is first developed and analysed in Java. Then a reduction onto the timed automata of Uppaal is detailed, which enables model checking. Different techniques are presented to minimize the model partial order. Finally, conclusions are drawn with an indication of further work.

II. THEATRE CONCEPTS IN JAVA

Theatre [16,8-9,7] is a variant of the Actors computational model [17] that addresses all the development phases of distributed timed systems. Theatre rests on global time, lightweight (that is, thread-less) actors, and asynchronous message passing governed by a reflective control-layer, which can be customized. The following considers an extension of Theatre with hybrid aspects, designed for modelling and analysis in Java of time-critical CPS. A CPS model is a federation of interconnected theatres (computing nodes). Each theatre hosts a collection of application actors. Actors' universal naming is directly based on Java object references. An actor is at rest until a message arrives. Processing (i.e., reacting to) a message is an atomic activity which can be neither suspended nor preempted (macro-step semantics [18]). Only at the termination of the current message reaction does the event-loop of the control layer resume its execution, select and deliver the next message for processing, and so forth. Message interleaving ensures a cooperative concurrency scheme among the actors of a theatre.

For modelling purposes, theatres are abstracted as processing units (PUs), each having a unique id. An actor is allocated to a PU using a move() operation (see Fig. 1). A PU can be free or busy. It is not possible to dispatch a message to an actor allocated on a currently busy PU. A PU becomes busy when one of its actors is delayed (see below).

  send( String msg[, Object…args] );
  send( double after, String msg[, Object…args] );
  send( double after, double deadline, String msg[, Object…args] );
  double now();
  void delay( double duration );
  void move( theatre-id );

Fig. 1. Basic Actor services

A programmer-defined actor class derives from the Actor base class, which exposes some fundamental services (see Fig. 1 and [9]) whose concretization depends on the adopted control layer. An actor class (see for example Fig. 5) declares an encapsulated data status which includes some acquaintances, i.e., names of actors to whom messages can be sent (for pro-activity, an actor can always send a message to itself), plus an interface of message servers (msgsrv) [13]. A msgsrv is a method which always returns void and can have arguments. A msgsrv processes a message with the same name. The basic non-blocking send relies on Java reflection and specifies, besides the msg name and its optional arguments, two time attributes: an after and a deadline [13]. Both quantities are relative to the current time (now()). The message cannot be delivered to its destination (that is, the actor object upon which the send operation is invoked) before the after interval has elapsed. The message becomes invalid and is discarded should the current time exceed the message deadline. When missing, an after evaluates to 0, and a deadline defaults to ∞.

A msgsrv can use a delay(d) operation to express the duration of a code fragment. Since Theatre actors are non-suspensive, a delay, if used, should be the last operation of a msgsrv. The effect of delay(d) is to occupy, for d time units, the PU upon which the actor is logically executing.

Theatre actors prove effective for modelling the time-discrete cyber part of a CPS [7,10]. The challenge is modelling the time-continuous physical part, that is, the external controlled environment. As in Hybrid Rebeca [11], extended Theatre admits continuous modes, which are special “physical actors” interfacing the cyber part with the external environment. A continuous mode (see Fig. 2), as in Hybrid Automata [19], logically consists of an initialization, an invariant, one or more flows (first order ODEs), a guard and a final action. A continuous mode is activated by a cyber actor (environment accessor), which at the mode termination (the guard evaluates to true) receives (final action) an instantaneous message with the computed values of the continuous variables. Such final messages make it possible to integrate the behavior of the continuous-time physical part with that of the discrete-time cyber part. Each continuous mode owns an implicit and hidden PU.

Fig. 2. Continuous vs. discrete time integration

Informal arguments to Theatre semantics

The semantics of a Theatre model can be given operationally [8] by a Timed Transition System (TTS) $(S, s_0, \rightarrow)$, where S is a set of states, $s_0$ is an initial state and → is the transition relation. A state is composed of:
• All the actor internal states (E).
• The current value of global time (now).
• The set of sent but not yet dispatched messages (M).
• The set of activated but not yet expired delays (D).

The temporal information of scheduled messages and set delays is assumed to be made absolute at the send/set time: after+now, deadline+now, delay+now.

A transition in the TTS can be a time advancement transition, or the occurrence of a most imminent event (message dispatch or delay expiration). Time advancement increases now to the time of the (or one of the) most imminent events.

A message can be dispatched to its target actor provided its PU is free and its deadline is not exceeded. Message dispatch causes a msgsrv to be atomically executed. A delay expiration makes a PU free again. When multiple events are eligible to


occur at the same time, one of them is chosen non-deterministically and occurs. Event non-determinism, though, can be controlled by attaching e.g. a Lamport logical clock (meta data) to messages, so as to dispatch simultaneous messages according to their logical clock, thus restoring (almost) the sending order. Activated continuous modes operate in parallel. However, their behaviour can be abstracted by final messages sent at the current time to the accessor actors.

An automotive-based control layer

A CPSSimulator control layer was developed to enable modelling and analysis in Java of Theatre-based CPS models, e.g. belonging to the automotive domain. The control layer recognizes wired messages and Controller Area Network (CAN) [12] transmitted messages. By default, messages are assumed to be wired and can have the usual after and deadline timing attributes. In addition, wired messages are equipped with a Lamport logical clock (LC). This way, messages with the same timestamp are ordered by their LC. CAN transmitted messages must be explicitly declared through the method (see also Fig. 11):

  void can( Actor sender, Actor receiver, String msg, int priority, double pDelay );

which specifies that message msg of receiver, invoked by sender, has an assigned (unique) priority and propagation time pDelay. Managing a CAN bus requires that all the messages which, at a given time, need to be transmitted on the bus are put (scheduled) in the CAN buffer, with only the highest-priority message allowed to occupy the bus. After that, the next highest-priority message in the buffer is considered for transmission, and so forth. The propagation delay is a relative time. Only when the highest-priority message is considered for delivery is its propagation delay made absolute, so that it can be compared with wired message timestamps and delay expiration times. The following order is ensured by the CPSSimulator on simultaneous events: a delay expiration precedes a wired message dispatch, which precedes a CAN-based message dispatch.

A CPSSimulator finishes after a given time limit is exceeded. However, an end() method (also exposed through the Actor class) can be used to terminate a simulation experiment prematurely.

III. A BBW CASE STUDY

A simple automotive-based Brake-By-Wire (BBW) model with Anti-lock Braking System (ABS) is considered, which was adapted from [11]. In the BBW, braking is achieved not by mechanical parts but by sensors and actuators. Moreover, when a slip rate greater than a given threshold is sensed on some wheel, the braking system is immediately released to avoid the risk of skidding. The Theatre-based actor model is summarized in Fig. 3, where rounded boxes denote continuous modes. For simplicity, only two wheels are considered, named left and right wheel.

The proposed BBW model significantly differs from [11] because it is deterministic. The model is basically periodic, with a period of 0.05sec. At each period, the new (angular) speed of the wheels and the new braking percentage (bprcnt) of the brake pedal are estimated (sampled) through simple ODEs by the Rolling and Braking continuous modes respectively. The wheel speeds are transmitted to the wheels and then to their wheel controllers (WCtrl); the braking percentage is transmitted to the brake (Brake) and then to the brake controller (BrakeCtrl). Wheel controllers in turn transmit the wheel speeds to the brake controller. The brake controller acts as the main controller of the model. When it has both the wheel speeds and the new bprcnt, it applies the control actions of the current period. It estimates the longitudinal vehicle speed (vspd) and sends to the wheel controllers the vspd and the new value of bprcnt (assumed equal to the torque level to be applied to the wheels). From the vspd and the wheel speed, each wheel controller evaluates the slip rate (slprt) and, for safety reasons, immediately releases the braking should the slip rate be greater than 0.2. The BBW model evolves towards the vehicle coming to a complete stop (wheel speed <=0). The bprcnt is increased from an initial value of 0.60 to a maximum value of 0.85. The initial value of the wheel speed is 1.

Fig. 3. Actor structure of the BBW model

Determinacy of the model is ensured by guaranteeing that, despite the nondeterminism of arrivals of the sample messages from the rolling modes to the wheels and from the braking mode to the Brake (pedal), the brake controller only reacts after all the peripheral device information has been received.

  public final class G {
    private G() {}
    public static double WRAD=0.45; //wheel radius
    public static double PERIOD=0.05;
    public static double SRT=0.2; //slip rate threshold
    public static int WN=2; //number of wheels
    public static double TEND=2.0D;
    public static int N=6; //number of theatres/PUs
    public static final int WR=0, WL=1, WCR=0, WCL=1;
    …
  }//G

Fig. 4. Scenario parameters of the BBW model

Wheel controllers and the brake controller exchange messages through a CAN bus. The Main class (see Fig. 11) configures the model by creating and initializing the actors and by establishing the priority (lower value is higher priority) and propagation delay (0.01sec) of each CAN transmitted message. For demonstration purposes, in Fig. 11 each actor is allocated to a different theatre/PU. Other partitionings could be considered as well. The Java BBW model is reported from


Fig. 4 to Fig. 11. Some global parameters of the considered scenario are collected in the G class, which is statically imported in the actor classes.

  public class Wheel extends Actor{
    //acquaintances
    private WCtrl ctrl;
    private Rolling rolling;
    //local data variables
    private double trq, spd;
    private int id;
    @Msgsrv
    public void init( Integer id, WCtrl ctrl, Rolling rolling, Double spd ) {
      this.id=id; this.ctrl=ctrl; this.rolling=rolling; this.spd=spd;
      rolling.send( "activate", this, spd, trq );
    }//init
    @Msgsrv public void setTrq( Double tq ) { trq=tq; }//setTrq
    @Msgsrv public void sample( Double sp ) {
      spd=sp; //angular wheel speed
      if( spd<=0.0D && id==WL ) {
        //print current time, the spd and the maxEED estimated values
        end();
      }
      ctrl.send( "setWspd", spd );
      rolling.send( "activate", this, spd, trq );
    }//sample
  }//Wheel

Fig. 5. The Wheel actor class

  public class WCtrl extends Actor{
    private Wheel wheel;
    private BrakeCtrl bctrl;
    private int id;
    private double wspd, slprt;
    @Msgsrv
    public void init( Integer id, Wheel wheel, BrakeCtrl bctrl ) {
      this.id=id; this.wheel=wheel; this.bctrl=bctrl;
    }//init
    @Msgsrv public void setWspd( Double spd ) {
      wspd=spd; bctrl.send( "setWspd", id, wspd );
    }//setWspd
    @Msgsrv public void applyTrq( Double reqTrq, Double vspd ) {
      if( vspd<=0.0D ) slprt=0.0D;
      else slprt=(vspd-wspd*WRAD)/vspd;
      //release the braking when the slip rate exceeds the threshold
      if( slprt>SRT ) wheel.send( "setTrq", 0.0D );
      else wheel.send( "setTrq", reqTrq );
    }//applyTrq
  }//WCtrl

Fig. 6. The WCtrl actor class

  public class Brake extends Actor{
    private BrakeCtrl bc;
    private double bprcnt, maxprcnt, r;
    private Braking braking;
    @Msgsrv
    public void init( BrakeCtrl bc, Braking brk, Double bprcnt, Double maxprc ) {
      this.bc=bc; this.braking=brk; this.bprcnt=bprcnt; this.maxprcnt=maxprc;
      r=1; braking.send( "activate", this, bprcnt, r );
    }//init
    @Msgsrv public void sample( Double bp ) {
      bprcnt=bp; bc.send( "setBprcnt", bprcnt );
      if( bprcnt>=maxprcnt ) r=0;
      braking.send( "activate", this, bprcnt, r );
    }//sample
  }//Brake

Fig. 7. The Brake (pedal) actor class

  public class BrakeCtrl extends Actor{
    private WCtrl wcR, wcL;
    private double wspdR, wspdL, bprcnt, vspd;
    private int c=0; //number of wheel speeds received in the current period
    @Msgsrv public void init( WCtrl wcR, WCtrl wcL ) {
      this.wcR=wcR; this.wcL=wcL;
    }//init
    @Msgsrv public void control() {
      //vehicle horizontal estimation speed
      vspd=((wspdR+wspdL)*WRAD)/2.0;
      wcR.send( "applyTrq", bprcnt, vspd );
      wcL.send( "applyTrq", bprcnt, vspd );
    }//control
    @Msgsrv public void setWspd( Integer id, Double wspd ) {
      if( id==WCR ) wspdR=wspd;
      else wspdL=wspd;
      c++;
      if( c==WN ) { send( "control" ); c=0; }
    }//setWspd
    @Msgsrv public void setBprcnt( Double bprcnt ) { this.bprcnt=bprcnt; }
  }//BrakeCtrl

Fig. 8. The BrakeCtrl actor class

  public class Rolling extends Mode{
    //estimates angular speed of a wheel
    double spd, trq;
    @Msgsrv
    public void activate( Wheel wheel, Double spd0, Double tq ) {
      trq=tq;
      //spd'=-0.1-trq
      spd=RollingODE.solve( PERIOD, spd0, trq );
      wheel.send( PERIOD, "sample", spd );
    }//activate
  }//Rolling

Fig. 9. The Rolling mode class

  public class Braking extends Mode{
    double bprcnt;
    @Msgsrv
    public void activate( Brake br, Double bprcnt0, Double r ) {
      //bprcnt'=r;
      bprcnt=BrakingODE.solve( PERIOD, bprcnt0, r );
      br.send( PERIOD, "sample", bprcnt );
    }//activate
  }//Braking

Fig. 10. The Braking mode class

  public class Main {
    public static void main( String... args ){
      CPSSimulator cm=new CPSSimulator( N, TEND );
      Wheel wr=new Wheel(), wl=new Wheel();
      WCtrl wcr=new WCtrl(), wcl=new WCtrl();
      Brake bp=new Brake(); BrakeCtrl bc=new BrakeCtrl();
      Rolling ror=new Rolling(), rol=new Rolling();
      Braking braking=new Braking();
      wr.send( "init", WR, wcr, ror, 1.0 ); wl.send( "init", WL, wcl, rol, 1.0 );
      wcr.send( "init", WCR, wr, bc ); wcl.send( "init", WCL, wl, bc );
      bp.send( "init", bc, braking, 0.60, 0.85 ); bc.send( "init", wcr, wcl );
      cm.can( bc, wcr, "applyTrq", 1, 0.01 ); //CAN declaration
      cm.can( bc, wcl, "applyTrq", 2, 0.01 );
      cm.can( wcr, bc, "setWspd", 3, 0.01 );
      cm.can( wcl, bc, "setWspd", 4, 0.01 );
      //actor partitioning: maximal parallelism as an example
      wr.move( 0 ); wl.move( 1 ); wcr.move( 2 ); wcl.move( 3 );
      bp.move( 4 ); bc.move( 5 );
      cm.controller(); //launches the control event-loop
    }//main
  }//Main

Fig. 11. The Main class
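The Rolling and Braking modes of Fig. 9 and Fig. 10 delegate the integration of their flows to RollingODE and BrakingODE, which the paper does not list. The following is a minimal, hypothetical stand-in for such helpers, assuming a fixed-step fourth-order Runge-Kutta scheme and a torque (or rate) held constant over one period. The class name, the method names and the number of steps are assumptions made for illustration; this is not the authors' implementation.

```java
import java.util.function.DoubleBinaryOperator;

/** Hypothetical stand-in for RollingODE/BrakingODE: fixed-step RK4 on a scalar ODE. */
final class ScalarOdeSketch {
    private ScalarOdeSketch() {}

    /** Integrates y' = f(t, y) from t=0 to t=horizon, starting at y0, in the given number of RK4 steps. */
    static double solve(DoubleBinaryOperator f, double horizon, double y0, int steps) {
        double h = horizon / steps;
        double t = 0.0;
        double y = y0;
        for (int i = 0; i < steps; i++) {
            double k1 = f.applyAsDouble(t, y);
            double k2 = f.applyAsDouble(t + h / 2, y + h * k1 / 2);
            double k3 = f.applyAsDouble(t + h / 2, y + h * k2 / 2);
            double k4 = f.applyAsDouble(t + h, y + h * k3);
            y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6;
            t += h;
        }
        return y;
    }

    /** Rolling flow of Fig. 9: spd' = -0.1 - trq, with trq held constant over the period. */
    static double rolling(double period, double spd0, double trq) {
        return solve((t, spd) -> -0.1 - trq, period, spd0, 10);
    }

    /** Braking flow of Fig. 10: bprcnt' = r. */
    static double braking(double period, double bprcnt0, double r) {
        return solve((t, b) -> r, period, bprcnt0, 10);
    }

    public static void main(String[] args) {
        double period = 0.05;                            // PERIOD in Fig. 4
        System.out.println(rolling(period, 1.0, 0.60));  // close to 1.0 - 0.70 * 0.05
        System.out.println(braking(period, 0.60, 1.0));  // close to 0.60 + 0.05
    }
}
```

Because both flows have a constant right-hand side within a period, any consistent one-step method gives the same value here; a general-purpose integrator only matters once the flows depend on the state or on time.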

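The tie-breaking rule that Section II states for simultaneous events (a delay expiration precedes a wired message dispatch, which precedes a CAN-based message dispatch, with wired messages ordered by their Lamport clock) can be sketched as a comparator over an event agenda. The types and names below are hypothetical and the sketch covers only the ordering rule. It does not model CAN bus arbitration, where the propagation delay of a message is made absolute only once that message gains the bus.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical sketch of tie-breaking among simultaneous events; not the CPSSimulator code. */
final class EventOrderSketch {
    private EventOrderSketch() {}

    /** Declaration order is the dispatch order among events with equal time. */
    enum Kind { DELAY_EXPIRATION, WIRED_MESSAGE, CAN_MESSAGE }

    static final class Event {
        final double time;   // absolute firing time
        final Kind kind;
        final long rank;     // Lamport clock for wired messages, priority for CAN messages
        final String label;
        Event(double time, Kind kind, long rank, String label) {
            this.time = time; this.kind = kind; this.rank = rank; this.label = label;
        }
    }

    /** Earlier time first, then event kind, then lower clock value or priority value. */
    static final Comparator<Event> ORDER =
        Comparator.comparingDouble((Event e) -> e.time)
                  .thenComparing((Event e) -> e.kind)
                  .thenComparingLong((Event e) -> e.rank);

    public static void main(String[] args) {
        PriorityQueue<Event> agenda = new PriorityQueue<>(ORDER);
        agenda.add(new Event(0.05, Kind.CAN_MESSAGE, 3, "setWspd (CAN, priority 3)"));
        agenda.add(new Event(0.05, Kind.WIRED_MESSAGE, 8, "sample (wired, LC 8)"));
        agenda.add(new Event(0.05, Kind.WIRED_MESSAGE, 7, "setBprcnt (wired, LC 7)"));
        agenda.add(new Event(0.05, Kind.DELAY_EXPIRATION, 0, "delay expiration"));
        agenda.add(new Event(0.04, Kind.CAN_MESSAGE, 1, "applyTrq (CAN, priority 1)"));
        while (!agenda.isEmpty()) System.out.println(agenda.poll().label);
    }
}
```

Using the CAN priority as the rank matches the convention of Fig. 11, where a lower value means a higher priority.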

It is worth noting that the ODE classes of continuous modes (RollingODE and BrakingODE, not shown for brevity) were implemented by using the Apache Commons Math3 library. For simulation purposes, some further variables were added to the BBW actor model so as to decorate it, e.g., to estimate the maximum end-to-end delay (maxEED) between the moment a new sample bprcnt value is generated and the time the bprcnt is applied to the wheels. Similarly, the speed values and torque values (bprcnt) were collected and stored to a file for offline analysis. Any simulation run terminates with the output: Vehicle stopped @1.20sec spd=-0.03 maxEED=0.04sec, that is, the vehicle is stopped after 24 periods and the maximum observed EED is 0.04sec, thus within the period. Fig. 12 shows the observed shape of the angular (wspd) and horizontal (vspd) speeds and torque level (trq) vs. time.

Fig. 12. Wheel and vehicle speeds and torque level vs. time

Model checking the BBW model

The simulation results of the BBW model give an important indication about its quantitative behavior, also considering that the model is deterministic. However, an exhaustive verification based on model checking [4] would allow a more in-depth analysis of the model properties.

A CPS model like the BBW could be analyzed by statistical model checking [5-6], e.g., through a reduction onto the hybrid automata of Uppaal [7,10,20]. The use of continuous modes with ODEs and of double variables, though, forbids the use of the Uppaal symbolic model checker [15]. Nonetheless, the Java BBW model can be reduced to Uppaal Timed Automata (TA) using the transformation rules proposed in [8], and by converting double values into integer values by a scale factor, with a corresponding approximated integer arithmetic. Most important, though, is the replacement of the continuous behavior. The ODE solvers' results at each time (period) can be pre-computed from the Java program and collected into constant arrays of the Uppaal model. This way, at each period, a continuous mode simply accesses the corresponding ODE value and transmits it to the accessor actor by a message. The behavior closely mimics that of modes kept in the final system implementation which, at each period, access the value of selected external environment variables by exploiting an abstraction like the envGateway proposed in [16,7].

First of all, all modeled entities (actors, messages, modes and delays) are mapped onto TA (template processes) whose instances have a unique identifier established by a corresponding sub-range type (typedef). The set of pending scheduled messages, set delays and activated modes are represented by corresponding entity instances with their timing constraints for firing. The organization implicitly relies on Uppaal nondeterminism for choosing the entity which may fire next. Broadcast channels are used for scheduling a message, activating a mode, setting a delay, proposing a message for transmission over a CAN bus, and for dispatching a message to a receiver actor. The use of broadcast channels is key to making the TA model usable also with the statistical model checker of Uppaal.

Actor automata. The template process of an actor directly corresponds to its high-level Java model. The automaton is organized into two main locations: Receive and Select. In the Receive (home) location, the next message to process is awaited. In the Select location, the particular received message ID is checked and the corresponding "msgsrv body" (reaction) executed. To comply with the macro-step semantics (see section II), each msgsrv body is realized as a cascade of committed locations, thus guaranteeing atomic execution. At the end of a msgsrv, the Receive location is re-entered. Some actor TA of the BBW model are shown in the figures from 13 to 16 (the simple Main automaton is not reported for brevity). A 10^3 scale factor is used for double variables (wheel speeds, bprcnt/torque level, time granularity etc.). One period thus becomes 50 time units (50ms). The wheel radius (see the WRAD constant in Fig. 4) is scaled, instead, by 10^2. The use of integer arithmetic can be observed in Fig. 14 and Fig. 16.

Fig. 13. The automaton of the Wheel actor

A critical aspect in TA actor modelling is the dynamic message exchange. The Uppaal model checker requires a statically defined number of entity instances. As a consequence, a pool of (timed or immediate) wired message instances, a pool of CAN message instances, a pool of delays etc. are used.

Sending a message is realized by first obtaining a message instance id through the nM() function, which checks if the message is wired or CAN-based, then filling some arguments in the global array arg[] and, finally, raising an output operation (!) on the send[.] channel indexed by the message instance id. In a similar way a mode instance can be activated etc.


Fig. 14. The automaton of the WCtrl actor

Fig. 15. The automaton of the Brake actor

Fig. 16. The automaton of the BrakeCtrl actor

Wired message automata. The two templates used are shown in Fig. 17 and Fig. 18. A wired and immediate message has to be delivered at the current time, provided the theatre/PU to which the receiver actor belongs is not busy.

Fig. 17. The automaton of ImmediateWiredMessage

The check channel, which is both broadcast and urgent, is fictitiously sent to force the passage from, e.g., the scheduled to the dispatch location in Fig. 17, when eligible, without delay. The more general automaton in Fig. 18 controls that the message deadline is not exceeded. A discarded message instance is simply returned to its pool. A dispatchable timed message is eventually delivered through an immediate wired message (see later for a discussion).

Fig. 18. The automaton of TimedWiredMessage

CAN message automata. The more complex template process in Fig. 19 ensures all simultaneous CAN submissions are kept in the buffer location, whereas only the highest priority one is allowed to proceed on the bus according to its propagation delay. The automaton in Fig. 19 also eventually generates an immediate wired message for its actual dispatch.

Fig. 19. The automaton of CanMessage

Continuous mode automata. In a statistical model checking model, a continuous mode like the Rolling one in the BBW model could be naturally expressed as a hybrid automaton as in Fig. 20.

Obviously, the automaton in Fig. 20 cannot be used in a TA model, because of the presence of the ODE flows and the use of double variables. Fig. 21 shows a corresponding automaton which simply reads the pre-computed ODE solver values. A similar automaton is used for the Braking mode.

Fig. 20. A hybrid automaton for the Rolling continuous mode
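The pre-computation of ODE values into constant integer arrays can be sketched as follows. This is an illustration under stated assumptions, not the authors' program: it takes the flow spd' = -0.1 - trq that labels the hybrid automaton of Fig. 20, holds trq constant within a period so that a closed-form step replaces the Apache Commons Math3 solver, and uses the 10^3 scale factor and 50 ms period quoted in the text. The class name OdeTable, its method names, and the array name passed to toUppaal are hypothetical.

```java
/** Hypothetical sketch: tabulate per-period ODE results as scaled integers. */
final class OdeTable {
    static final int SCALE = 1000;        // 10^3 scale factor for double values
    static final double PERIOD = 0.05;    // one period of 50 ms, in seconds

    /** Advances the assumed flow spd' = -0.1 - trq by one period, with trq held constant. */
    static double step(double spd, double trq) {
        return spd + (-0.1 - trq) * PERIOD;   // exact here, since the right-hand side is constant
    }

    /** Returns the scaled, rounded speed reached at the end of each period. */
    static int[] precompute(double spd0, double[] trqPerPeriod) {
        int[] table = new int[trqPerPeriod.length];
        double spd = spd0;
        for (int k = 0; k < trqPerPeriod.length; k++) {
            spd = step(spd, trqPerPeriod[k]);
            table[k] = (int) Math.round(spd * SCALE);
        }
        return table;
    }

    /** Formats the table as a constant integer array declaration for the Uppaal model. */
    static String toUppaal(String name, int[] values) {
        StringBuilder sb = new StringBuilder("const int " + name + "[" + values.length + "] = {");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) sb.append(", ");
            sb.append(values[i]);
        }
        return sb.append("};").toString();
    }
}
```

A timed automaton like the one in Fig. 21 would then only index such an array once per period, which keeps every variable an integer.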


Fig. 21. A timed automaton for the Rolling "continuous" mode

System configuration. The complete TA BBW model is configured as shown in the following:

system ImmediateWiredMessage, CanMessage, Rolling, Braking, Wheel, WCtrl, Brake, BrakeCtlr, Main;

where the use of a single parameter of the relevant subrange type guarantees that the necessary number of instances of each kind of automaton is initially created.

Nondeterminism and partial order. The achieved TA BBW model has a significant amount of nondeterminism due to the presence of different simultaneously enabled outgoing transitions. This implies that partial order exists in many states of the model state graph (timed transition system). Since the BBW model is deterministic "by design", such a partial order affects only the performance of the model checker. For example, the query checking for the absence of deadlocks ends in about 100sec on an ASUS ZenBook Win10 64bit, Intel Core i7-8565U CPU@1.80GHz, 16GB.

Reducing the partial order is a complex activity on many TA models. For Theatre models it could be important to ensure that message dispatch occurs in the sending order, as in the Java implementation. A mechanism was devised to control the message delivery order. It is recalled that in the Uppaal TA model all the messages ultimately become immediate wired messages. This way, ordering the immediate wired messages can improve the efficiency of the model checker. The pool of immediate wired message instances was turned into a First-In First-Out (FIFO) queue. When a new message instance is requested by the nM(…) function, the available slot at the rear of the pool is returned. When an immediate wired message is in the position to be dispatched, only the instance at the head of the pool is allowed to conclude its delivery.

Fig. 22. The automaton of ImmediateWiredMessage with FIFO delivery order

The new template process in Fig. 22 was adopted for an immediate wired message. Function first(wm,dest) returns true if the message instance is at the head of the queue and its receiver actor belongs to a free PU. In the case the message is at the head but its PU is busy, to preserve the FIFO ordering, the delayed message instance is extracted from the pool and transferred to a delayed bag, from which it will be delivered as soon as its PU becomes free. All these actions are carried out by the function get(…).

To give an idea of the effects of the FIFO ordering on the partial order, the query that checks for the absence of deadlocks now terminates in about 9sec.

Another source of partial order in the BBW model is concerned with the three simultaneous sample timed messages generated by the continuous modes. To order the three timed messages, a precedence constraint was introduced which forces the sample messages to occur in a fixed order: first the sample for the right wheel, then that for the left wheel, finally that for the brake (pedal). In the case the vehicle is stopped, the precedence constraint always allows the sample for the brake.

With the combined effects of FIFO ordering and the precedence constraint, the query checking for the absence of deadlocks now ends in 0.453sec. The resultant TA BBW model was then property checked exhaustively by the Uppaal model checker.

Model checking the BBW TA model. Obviously, a control layer like CPSSimulator or a statistical model checker cannot answer a query which asks if the model is without deadlocks, simply because they are based on simulation. A simulation corresponds to a particular execution path on the complete state graph of the model. Although with approximate arithmetic and integer variables, the final TA BBW model was model checked with the TCTL [15] queries reported in Table 1.

Table 1. Model checking queries for the TA BBW model
# | Query | Result
1 | A[] !deadlock | satisfied
2 | A<> forall(w:wheel_id)Wheel(w).spd<=0 | satisfied
3 | A[] WCtrl(WCR).slprt+WCtrl(WCL).slprt==0 | satisfied
4 | A[] (exists(w:wheel_id)Wheel(w).Select && msg==SET_TRQ) imply x<=40 | satisfied
5 | A[] (exists(w:wheel_id)Wheel(w).Select && msg==SET_TRQ) imply x<=39 | not satisfied
6 | A[] (exists(w:wheel_id)Wheel(w).A && Wheel(w).spd<=0) imply y<=1200 | satisfied
7 | A[] (exists(w:wheel_id)Wheel(w).A && Wheel(w).spd<=0) imply y<=1199 | not satisfied
8 | A[] canSubmissions()<=2 | satisfied
9 | A[] exists(cm1:cmi)CanMessage(cm1).scheduled imply (exists(cm2:cmi)cm1!=cm2 && CanMessage(cm2).buffer && CanMessage(cm1).msg==CanMessage(cm2).msg && hPrio(CanMessage(cm1).cs)) | satisfied

The positive answer to query 1 guarantees the BBW is without deadlocks in any state. This can appear a little strange because, when the vehicle is stopped, the rolling modes no longer evaluate a next value for the wheel speed and thus no further control is generated by the brake controller. In reality, even when wheels are stopped, the braking mode continues (although unnecessarily) to generate a sample for the Brake with the reached maximum bprcnt.

Query 2 says that, starting from the initial state, the model inevitably moves to a final state where the wheels are stopped. In other terms, it is guaranteed that the vehicle is eventually stopped.
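The FIFO delivery mechanism with a delayed bag, described earlier for the immediate wired messages, can be sketched in Java as follows. This is a hedged illustration rather than the Uppaal functions first(…) and get(…) themselves: the class and method names are hypothetical, and serving the delayed bag before the queue head is an assumption about how "as soon as its PU becomes free" is resolved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch of FIFO dispatch with a bag for messages whose PU is busy. */
final class FifoDispatcher {
    /** A pending immediate wired message: its id and the PU hosting its receiver. */
    static final class Msg {
        final int id, pu;
        Msg(int id, int pu) { this.id = id; this.pu = pu; }
    }

    private final Deque<Msg> queue = new ArrayDeque<>();  // FIFO pool, head is the oldest message
    private final List<Msg> delayed = new ArrayList<>();  // head messages that found a busy PU
    private final boolean[] busy;                         // busy[p] is true while PU p is executing

    FifoDispatcher(int puCount) { busy = new boolean[puCount]; }

    void enqueue(Msg m) { queue.addLast(m); }             // new instances enter at the rear
    void setBusy(int pu, boolean b) { busy[pu] = b; }

    /** Returns the next deliverable message, or null if none can be delivered now. */
    Msg dispatchNext() {
        // Assumed policy: a delayed message is served first once its PU is free.
        for (Iterator<Msg> it = delayed.iterator(); it.hasNext();) {
            Msg m = it.next();
            if (!busy[m.pu]) { it.remove(); return m; }
        }
        // Only the head may conclude delivery; a head with a busy PU moves to the bag.
        while (!queue.isEmpty()) {
            Msg head = queue.removeFirst();
            if (!busy[head.pu]) return head;
            delayed.add(head);
        }
        return null;
    }
}
```

Moving a blocked head into the bag keeps later messages for free PUs from being held up, while the order among messages for the same PU is preserved in this sketch only as long as the bag is scanned in insertion order.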


Since the model is "ideal" and no road condition (oil or similar) is considered which would create a slip rate, query 3 states that the slip rate is indeed always kept to 0.

Query 4 is related to assessing the end-to-end delay from the instant in time when a new bprcnt is generated by the Braking mode to the time when the bprcnt/torque level is actuated on the wheels. A decoration clock x is used which is reset when a new SAMPLE message is received by Brake, and it is checked when a SET_TRQ message is received by a wheel. The max EED is 40msec or (considering the scale factor of 10^3) 0.04sec.

The fact that query 5 is not satisfied confirms that 40 is indeed the max EED.

Although query 2 ensures the vehicle comes to a complete stop, queries 6 and 7 check the maximum time which elapses before stopping. A second decoration clock y is used which is reset initially and checked when any wheel registers a speed value <=0 (see also Fig. 13).

Finally, queries 8 and 9 check the correct behaviour of the CAN bus (see Fig. 19). In the BBW model, either speed values are transmitted by wheel controllers to the brake controller (SET_WSPD messages) or a new torque level is sent by the brake controller to the wheel controllers (APPLY_TRQ messages). In any case no more than two CAN submissions can co-exist (query 8). Query 9 says that when two CAN messages are submitted, they are necessarily of the same kind: one is in the scheduled location (see Fig. 19) and the other is in the buffer location. Then, the progressing submission always relates to the highest priority message.

Model checking confirms the results of the preliminary simulation carried out in Java, and provides more arguments about the correctness of the BBW model behaviour.

IV. CONCLUSIONS

The work described in this paper shows that the Theatre actor system [8-9], extended with continuous modes [7,10], proves to be effective in the modelling and analysis of cyber-physical systems (CPS). Both a Java-based simulation approach and a reduction to Uppaal [15] timed automata for exhaustive model checking were developed. The model checking approach is improved by implementing mechanisms for reducing the model partial order.

The reported work represents one step toward the achievement of a deterministic version of Theatre [21]. Deterministic Theatre would favor the design, analysis and synthesis of reproducible CPS models, by ensuring a pre-defined message delivery order which, as in [4], can rely on unique ids (priority) attached to actors and messages. This way, messages will first be delivered in timestamp order. Simultaneous messages, though, will be delivered according to the priority of the receiver actors or the priority of messages sent to the same actor. Preliminary experience is described in [22].

REFERENCES

[1] E.A. Lee, S.A. Seshia. Introduction to Embedded Systems - A Cyber-physical systems approach. 2nd Edition, 2017.
[2] E.A. Lee, M. Sirjani. What good are models? In Int. Conf. on Formal Aspects of Component Software. Springer, Cham, 2018.
[3] E.A. Lee. Models in Engineering and in Science. Communications of the ACM, 62(1):35-36, 2019.
[4] M. Sirjani, E. Khamespanah, E.A. Lee. Model checking software in cyberphysical systems. In Proc. of IEEE Computer Software and Application Conference (COMPSAC 2020), Madrid, Spain, 2020.
[5] G. Agha, K. Palmskog. A survey of statistical model checking. ACM Trans. Model. Comput. Simul. 28(1):6:1-6:39. URL http://doi.acm.org/10.1145/3158668, 2018.
[6] E.M. Clarke, P. Zuliani. Statistical model checking for cyber-physical systems. ATVA 2011, LNCS 6996, Springer, 2011.
[7] F. Cicirelli, L. Nigro. Home energy management using Theatre with hybrid actors. In Proc. of ACM/IEEE Symp. on Distributed Simulation and Real Time Applications (DSRT19), October, Cosenza, Italy, 2019.
[8] L. Nigro, P.F. Sciammarella. Qualitative and quantitative model checking of distributed probabilistic timed actors. Simulation Modelling Practice and Theory, doi 10.1016/j.simpat.2018.07.011, 87, pp. 343-368, 2018.
[9] F. Cicirelli, L. Nigro, P.F. Sciammarella. Seamless development in Java of distributed real-time systems using actors. Int. J. Simulation and Process Modelling, Vol. 15, Nos. 1/2, pp. 13-29, 2020.
[10] L. Nigro, P.F. Sciammarella. Statistical model checking of cyber-physical systems using hybrid Theatre. Advances in Intelligent Systems and Computing, Springer, https://doi.org/10.1007/978-3-030-29516-5, pp. 1232-1251.
[11] I. Jahandideh, F. Ghassemi, M. Sirjani. Hybrid Rebeca: Modeling and analyzing of cyber-physical systems. Model-based Design of Cyber-Physical Systems, January. arXiv preprint arXiv:1901.02597, 2019.
[12] K. Lawrenz. CAN System Engineering. Springer, 2013.
[13] A. Jafari, E. Khamespanah, M. Sirjani, H. Hermanns, M. Cimini. PTRebeca: Modeling and analysis of distributed and asynchronous systems. Sci. Comput. Program. Vol. 128, pp. 22-50, 2016.
[14] Rebeca modelling language, on-line: http://rebeca-lang.org/.
[15] G. Behrmann, A. David, K.G. Larsen. A tutorial on UPPAAL. In Formal Methods for the Design of Real-Time Systems, M. Bernardo and F. Corradini Eds., Lecture Notes in Computer Science, Vol. 3185, Springer-Verlag, pp. 200-236, 2004.
[16] F. Cicirelli, L. Nigro, P.F. Sciammarella. Model continuity in cyber-physical systems: A control-centred methodology based on agents. Simulation Modelling Practice and Theory, 83(4):93-107, 2018.
[17] G. Agha. Actors: a model of concurrent computation in distributed systems. MIT Press, Cambridge, MA, USA, 1986.
[18] R.K. Karmani, G. Agha. Actors. Springer US, Boston, MA, pp. 1-11, 10.1007/978-0-387-09766-4_125, 2011.
[19] T.A. Henzinger. The theory of hybrid automata. In Verification of Digital and Hybrid Systems. Springer, Berlin, Heidelberg, pp. 265-292, 2000.
[20] A. David, K.G. Larsen, A. Legay, M. Mikucionis, D.B. Poulsen. UPPAAL SMC Tutorial. Int. J. on Software Tools for Technology Transfer, Springer, 17:1-19, 06.01.2015, DOI 10.1007/s10009-014-0361-y, 2015.
[21] M. Lohstroh, E.A. Lee. Deterministic actors. Forum on Specification and Design Languages, Southampton, UK, 2019.
[22] L. Nigro. Modelling and analysis of cyber-physical systems using Deterministic Theatre. IEEE WorldS4, London, 27-28 July, 2020.


Performance Evaluation of
HLA RTI Implementations
Moritz Gütlein, Wojciech Baron, Christopher Renner, Anatoli Djanatliev
Computer Networks and Communication Systems, Dept. of Computer Science
University of Erlangen-Nürnberg
Erlangen, Germany
{moritz.guetlein,wojciech.baron,chris.renner,anatoli.djanatliev}@fau.de

Abstract—The High Level Architecture is an IEEE standard that enables distributed simulation. There are several implementations of the standard, which provide, among other things, the Run-Time Infrastructure component. This paper compares the four best-known RTI implementations, namely MAK RTI, Pitch pRTI, Portico, and CERTI, with a focus on performance evaluation. In general, Pitch pRTI was the fastest implementation for most of our experiments. CERTI performed best for big payload sizes and Portico showed an interesting oscillation pattern.

Index Terms—High Level Architecture, HLA, Distributed Simulation, Performance, Middleware, RTI

This work is part of the Virtual Mobility World (ViM) project and has been funded by the Bavarian Ministry of Economic Affairs, Regional Development and Energy (StMWi) through the Centre Digitisation.Bavaria, an initiative of the Bavarian State Government.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Distributed Simulation (DS) is associated with many advantages such as overcoming memory limits, performance gains, and fault tolerance. The techniques can be applied to build co-simulations, by interconnecting different simulators running on various systems, or to couple simulators with real-world components. The latter can be used for instance to perform Hardware-In-the-Loop (HIL) tests of electronic control units (ECUs) in the vehicular context. Naturally, every application comes with its own requirements. In the case of HIL, a logical simulation clock must be synchronized to the wall clock time. For other use cases, it might be desirable for the simulation to run faster than real time by orders of magnitude.

There are two main approaches to couple simulators (and other components). First, implement a direct connection between the tools. This typically comes with low overhead, but requires manual work for every new topology and module. If more than two components should be connected, the implementation effort increases and, depending on the synchronization concept, the performance could drop.

Second, a dedicated middleware could be used. Usually, the middleware takes care of the delivery of messages between tools and it manages a global simulation clock. The middleware can help to speed up the implementation of new simulation setups by reusing existing modules and providing interoperability. In addition, an established middleware has been thoroughly tested and is therefore more trustworthy than proprietary self-crafted and possibly error-prone solutions.

One of these middleware approaches for DS is the High Level Architecture (HLA). The development of HLA was started in the early 1990s by the US Department of Defense, which led to the major release of HLA 1.3 in 1998. In the late 1990s the steering wheel was handed over to IEEE, which resulted in the first international standard (IEEE 1516-2000) in 2000. In 2010, HLA Evolved (IEEE 1516-2010) followed and is currently the most recent version. At the moment, a subsequent release is being developed. The HLA defines a set of services to be provided, while the underlying communication layer is left up to the middleware implementation.

The performance of the middleware implementation is crucial for the performance of the entire distributed simulation setup. Since implementers are free to use different approaches for the lower communication layer, the comparison of different HLA middleware implementations is of interest. This freedom is not only about the wire protocol itself, but also the decision of what data needs to be exchanged, with whom and when [1]. This implies that the performance of different RTIs may be heterogeneous depending on the simulation setup. One implementation may be highly suitable for a simple use case due to its Data Distribution Management (DDM) tweaks, while another might perform better regarding scalability when the scenario gets huge.

In order to compare the performance between the different implementations, we constructed four small-scale test cases and executed them for each RTI implementation under test. Hence, this paper focuses on a performance comparison between the four leading HLA RTIs based on prototypical test cases. The rest of the paper is organized as follows: in Section II, more details about the HLA, the RTIs, and related work are given. The experimental setup and the different test cases are described in Section III, while the results are presented in Section IV. Finally, a conclusion and an outlook are given in Section V.

II. BACKGROUND

A. High Level Architecture

A priori, two important terms have to be clarified: federate and federation. A federate is a single participant in a simulation run, hence an application that supports the interface to the RTI. This does not necessarily have to be a simulator (e.g., it could be an application connecting a HIL device to an RTI). The federation is the set of all connected federates with the same federation object model. The running federation with federates connected by an RTI is referred to as a federation execution. In the following, further information is given with reference to the current version of the HLA standard (i.e., IEEE 1516-2010).

Fig. 1. HLA Architecture (Federates A, B, and C connected through the Runtime Infrastructure (RTI)).

The motivation behind the HLA is to empower the creation of interoperable simulation systems. Therefore, a framework is crafted which can be used by developers to organize and define simulation applications. The standard mentions two major aims leading to flexibility: first, creating interoperability between different simulations and second, supporting model reuse across various domains [2].

While the standard describes services and rules that are required, there is no full reference implementation. Therefore, "the" HLA software does not exist. Thankfully, there are multiple projects that implement core functionalities within a so-called RTI software. They offer federate stubs that can be integrated and adapted to the existing simulation submodels [2].

There are ten rules that define responsibilities for federates and federations. This shall guarantee the correct interaction of different participants. Additionally, there is the object model template (OMT), which defines the structure of data models that are used to specify a simulation [2].

Three specific models are based on the OMT: the simulation object model (SOM), the federation object model (FOM), and the management object model (MOM). The SOM is related to an individual federate and gives information about the interfaces of the federate, hence the possible data structures a federate can publish or consume. The FOM defines all information types that can be exchanged during a federation execution (e.g., objects and interactions with their parameters). Naturally, it is not strictly necessary to use all possibilities that are provided by different SOMs in a FOM. The MOM is used for information apart from simulation related payload (e.g., controlling and monitoring tasks).

Federates can take part in the time synchronization process, but are not forced to. The participation is split into two different types. A federate can be time regulating (i.e., the federate has influence on the progress of time and can send timestamp order (TSO) messages) and/or be time constrained (i.e., the federate can only proceed when permission is granted and can receive TSO messages). The mode is not static, but can be changed during an execution run. When a federate becomes time-regulating, a lookahead value needs to be defined. TSO messages are processed in a TSO buffer, otherwise there is a receive order (RO) buffer for RO messages.

The lookahead value describes a lower bound for sent TSO messages. That means a time-regulating federate that has a lookahead of LA and is currently at time T cannot send TSO messages with a timestamp lower than T + LA.

A time advance can be requested via five different request types: TimeAdvanceRequest (TAR), TimeAdvanceRequestAvailable (TARA), Next Message Request (NMR), Next Message Request Available (NMRA), and Flush Queue Request (FQR). When a TimeAdvanceGrant (TAG) is received, a federate can continue to the requested simulation time point. In addition to the ordering modes, there are two different transport modes: best effort and reliable transportation.

In the light of interoperability between HLA 1.3 and HLA 1516-2000, some differences are given in [3]. Related to the FOM, standard data types were introduced and the object model is in XML format and is Unicode encoded since HLA 1516-2000. The FOM file is directly used by the RTI; no additional FED file is needed anymore. In contrast to these cosmetic changes, there are indeed changes in the behavior of some functions. For instance, the authors mention that in the 1.3 standard existing subscriptions to attributes of an object are lost when a federate subscribes to an additional attribute. In 1516-2000, there is no automatic replacement, but an appending of subscriptions. The ten golden rules, defining the responsibilities for federates and federation, have not been modified.

In HLA 1.3, there is only one method to deliver callbacks to a federate. The federate has to call a tick() method on the RTI ambassador (evokeMultipleCallbacks() since HLA 1516-2000) in order to receive pending callbacks. Later, synchronous polling for callbacks was called HLA EVOKE. With 1516-2000, it is also possible to use a mode called HLA IMMEDIATE. In this mode, the federate receives callbacks immediately when they arrive at the local RTI component. This possibly brings better performance, but of course having a thread-safe program is absolutely necessary in this case [4].

The 1516-2010 standard is also no replacement of the 1516-2000 standard, but more of a progression. The functionality of 1516-2000 remains, while some additional features are added. Some of them are web service abilities, support for fault tolerance, and a more modular structure of the FOMs [5].

B. Middleware Implementations

As mentioned, we focus on the four leading HLA RTIs. The importance of different RTIs was determined by citation count via Google Scholar. Winners of this challenge are CERTI [6], MAK RTI [7], Pitch pRTI [8], and Portico [9]. While the second and third are commercial products, Portico is released under CDDL and CERTI under GPL/LGPL. All of them provide (at least partial) support for HLA 1516-2010. MAK and Pitch provide limited test versions (e.g., with a restriction to two federates), which were used for the


performance evaluation. In the following, we give a brief introduction to the HLA implementations.

TABLE I
USED RTI IMPLEMENTATIONS.

Name              Version   Bindings    Release date   Centralized
CERTI             4.0.0     C++         2018           yes
MAK RTI           4.5       C++, Java   2018           both
Pitch pRTI Free   5.4.5.0   C++, Java   2018-2019      yes
Portico           2.1.0     C++, Java   2016           no

1) CERTI: The French Aerospace Lab started to develop CERTI in 1996. Since 2002, CERTI has been an open source project and comes with a C++ binding. From an architectural view, the CERTI RTI is split into two parts: the ambassador one (RTIA) and the global one (RTIG). Each federate resides with a local RTIA, which allows the communication between federate and RTIA to take place via inter process communication (IPC). Then there is one global RTIG process, which communicates with the RTIA processes over TCP and UDP sockets [10].

2) MAK RTI: Similar to CERTI, MAK talks about a local RTI component (LRC) within each federate and a global RTI Executive. The LRC covers more than just an RTI ambassador implementation, which allows running decentralized for some use cases. Connections are mainly realized via TCP sockets. There are four different connection types possible: a lightweight connection, a lightweight loopback connection, an rtiexec connection, and an rtiexec loopback connection. While the lightweight mode is decentralized, it is only “well-suited for many real-time federations that do not use Time-Management, DDM, MOM, reliable transport (TCP), or synchronization points” [11]. Since our experiments rely on time management, the rtiexec connection will be used for the following measurements.

3) Pitch pRTI: Pitch makes use of a similar terminology. While a central RTI component (CRC) may reside on a separate node or side by side with a federate, each federate has its own local RTI component (LRC). The CRC is known as RTIexec. It manages a federation and delegates work between LRCs. There are federate bindings for C++ and Java. Pitch uses TCP sockets for reliable and UDP sockets for best effort communication. If the federates run in the same process, communication is done via shared memory. For the best effort transport mode, multicast communication can be used as well [12].

4) Portico: Started as jaRTI in 2005, a stable version of Portico is currently available in version 2.1.0, released in 2016. The Portico project was started with funding help from the Australian Defense Simulation Office. The lack of open source RTI implementations was proclaimed as the main motivation. Therefore, the creation of a mature open source RTI alternative to commercial RTIs was intended.

Major changes are currently underway to move from a fully decentralized approach in version 2.1.0 to a more centralized paradigm [13]. This transition could have a serious impact on Portico’s performance in future releases. The developers state that due to their architecture it is very easy to change communication protocols in contrast to other projects. In the current version, there is the possibility of communicating over jgroups or shared memory (in case federates are run in the same process) [14].

C. Related Work

Besides HLA, there are plenty of other approaches that are more or less suited for distributed simulation. Distributed Interactive Simulation (DIS) is seen as the predecessor of HLA [15]. Similar to HLA, DIS also comes from the military sector. A major drawback of DIS is the lack of guarantees regarding repeatability of experiments. Messages that are exchanged between components are typically processed in receive order [16]. Due to latency, jitter, or packet losses, messages are in general not delivered in a fixed order. Therefore, such distributed simulations will not be deterministic.

More recent developments are the Functional Mock-up Interface (FMI) [17] and the Data Distribution Service (DDS) [18]. The FMI aims to be a standard for model exchange as well as co-simulation [19] for mainly continuous simulation models. A so-called master algorithm has to be provided to trigger the different submodels. The master algorithm is not part of the FMI standard. The same applies to message-based data exchange. Accordingly, the standard does not really cover distributed simulation. Therefore, there are some works on binding together HLA and FMI [20]–[23] and on combining FMI with Remote Procedure Calls (RPCs) [24]. More recently, the Distributed Co-Simulation Protocol (DCP) came up in order to extend FMI [25]. The DDS addresses distributed systems in general, and not directly distributed simulation. The main issue is the absence of an integrated timing mechanism. This was addressed by [26] with a combination of DDS and HLA and by [27] with the extension of a time management mechanism. Similarly, there are multiple approaches that implement a distributed simulation middleware on top of general purpose messaging systems (e.g., Apache Kafka [28]).

There has already been a publication investigating the performance of CERTI, MAK, and Portico that dates back to 2009 [29]. Naturally, the authors had to use older versions, namely CERTI 3.3.0, MAK 3.2, and Portico 0.8. They took two federates A and B and measured the time for an attribute update to be sent by A, received and updated a second time by B, and finally received by A again. This was done for both the reliable and the best effort transport option. Additionally, the size of the attribute was changed throughout multiple runs. In general, Portico was observed to be an order of magnitude slower than MAK and CERTI, while MAK was the fastest. With increasing attribute size (up to 1024 kB) the performance differences became smaller. Unfortunately, the authors measured only a small number of samples (≤ 100). Since these results are older than a decade and the other major RTI, Pitch, is missing, we are aiming for recent measurements in this work. At the same time, we are focusing on the reliable transport mode.

Thus, our main contribution is an up to date performance comparison of the four major HLA RTI implementations for


the current version of the standard (IEEE 1516-2010). This work should help novices as well as experienced users to rank their own performance measures, choose an RTI, or extend our test suite. The provided code shall also lower the entry barrier into the world of HLA, which can be considerable, especially due to poor documentation or buggy implementations.

III. EXPERIMENTAL DESIGN

We designed four different experiments in order to compare base cases between the different RTI implementations that were described in Subsection II-B. We looked at the time advance progress, object attribute updates, and ownership transfer in conjunction with object attribute updates. In another experiment, we were interested in the impact of the message size. Consequently, we sent HLA interactions with varying payload sizes.

The HLA EVOKED callback model was used for all conducted experiments. A minimum waiting time of 1 ms and a maximum waiting time of 10 ms for an evokeMultipleCallbacks() call was applied during all tests. Naturally, the different implementations come with a variety of parameters and possibilities for fine-tuning. We stick to the default parameter set of each RTI for the sake of comparability. In case of MAK, the rtiexec connection was used. For Portico, the communication via jgroups was chosen, since our test federates were run in separate processes. Because the data should be received in time stamp order, the reliable communication mode was used.

FOM
    objects
        objectClass
            name: HLAobjectRoot
            objectClass
                name: TestcaseObject
                sharing: PublishSubscribe
                attribute
                    name: TestcaseObjectAttribute
                    dataType: TestInteger64BE
                    updateType: Conditional
                    ownership: DivestAcquire
                    sharing: PublishSubscribe
                    transportation: HLAreliable
                    order: TimeStamp
    interactions
        interactionClass
            name: HLAinteractionRoot
            interactionClass
                name: TestcaseInteraction
                sharing: PublishSubscribe
                transportation: HLAreliable
                order: TimeStamp
                parameter
                    name: Payload
                    dataType: HLAopaqueData

Fig. 2. Federation Object Model.

The first three experiments were run for every available binding, namely C++ and Java. These are available for every middleware except for CERTI, which only ships with a C++ binding. For each run, measurements per action cycle were captured. For each test, the two test federates were executed once on the same machine (Local Run) and once on two different physical machines (Ethernet Run). The fourth experiment was a pure Ethernet Run. We chose the fastest bindings for each RTI and measured $10^5$ cycles for every RTI and payload size.

In order to execute the different experiments, a virtual machine (VM) based on CentOS 7.4 was set up with all implementations. The VMs were hosted via VirtualBox 5.2 on Ubuntu 18.04 machines with an i7-2600 at 3.4 GHz and 8 GB RAM. 4 GB of the RAM and four of the cores were assigned to the virtual machine. The different VMs were connected over a Gigabit Ethernet switch, while the VirtualBox adapter was set to bridge mode.

Figure 2 shows the FOM that was used throughout all the test cases. It contains definitions for an object with a single 64-bit integer attribute and an interaction with a variable payload (HLAopaqueData). The ordering of both is Time Stamp Order.

The implementation of the test suite and of each test case can be found under [30]. Basically, there is a generic test case with three functions: init(role), step(iteration), and finish(). While the first and the last allow, for instance, registering or destroying objects, the step function is called iteratively until the test is finished. The duration of each call is measured (see Algorithm 1). The role parameter allows defining a different behavior for the involved federates (e.g., send an interaction before advancing time).

A. Experiments

As mentioned before, the comparison of different RTIs is not straightforward due to the freedom regarding wire protocols and DDM. Therefore, simple but reproducible experiments were designed to get an impression of the RTIs’ performance. Each experiment involves two federates. The roles 0 and 1 were assigned to the two components. Both of them are time regulating and time constrained.

1) Time Synchronization: The two federates continuously advance their time with a fixed and common step length (see Algorithm 2). Both federates act identically, thus the role parameter is ignored. The elapsed wall clock time is measured for each time advance cycle.

Function runTest(testcase, role):
    testcase.init(role);
    while notFinished do
        timer.start();
        testcase.step(iteration++);
        timer.stop();
    end
    testcase.finish();

Algorithm 1: Base test class.
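
The timing loop of Algorithm 1 can be sketched in Python as follows. This is only a minimal illustration and not the code of the published test suite [30]: the class RecordingTestCase is a hypothetical stand-in that records hook calls instead of talking to an RTI, and the open-ended "while notFinished" condition is replaced by an assumed fixed number of cycles.

```python
import time


class RecordingTestCase:
    """Hypothetical test case that records hook calls instead of using an RTI."""

    def __init__(self):
        self.calls = []

    def init(self, role):
        self.calls.append(f"init:{role}")

    def step(self, iteration):
        self.calls.append(f"step:{iteration}")

    def finish(self):
        self.calls.append("finish")


def run_test(testcase, role, cycles):
    """Call init, then `cycles` timed steps, then finish; return durations in ms."""
    durations_ms = []
    testcase.init(role)
    for iteration in range(cycles):
        start = time.perf_counter()             # corresponds to timer.start()
        testcase.step(iteration)
        elapsed = time.perf_counter() - start   # corresponds to timer.stop()
        durations_ms.append(elapsed * 1000.0)
    testcase.finish()
    return durations_ms


case = RecordingTestCase()
durations = run_test(case, role=0, cycles=3)
print(case.calls)
```

Only the step call sits between the two clock reads, so setup and teardown in init and finish do not enter the per-cycle durations.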


Function TestTimeAdvance.step(iteration):
    advanceTime();
    return;
Function TestObjectUpdate.step(iteration):
    if role == 0 then
        e = encodeValue(counter++);
        att.put(e);
        rtiamb.updateAttributeValues(handle, att);
    advanceTime();
    return;
Function TestRT.step(iteration):
    if currentlyOwning then
        e = encodeValue(lastReceivedValue+1);
        att.put(e);
        rtiamb.updateAttributeValues(handle, att);
        while !shouldDivest do
            evokeMultipleCallbacks();
        end
        rtiamb.unconditionalAttributeOwnershipDivestiture(...);
    else
        rtiamb.attributeOwnershipAcquisition(...);
        while !currentlyOwning do
            evokeMultipleCallbacks();
        end
    advanceTime();
    return;
Function TestInteraction.step(iteration):
    if role == 0 then
        payload = new payload[interactionParamSize];
        e = encodeValue(payload);
        parameters.put(e);
        rtiamb.sendInteraction(handle, parameters);
        delete payload;
    advanceTime();
    return;

Algorithm 2: Logic of the four experiments.

2) Object Attribute Update: Again, the federates advance their time with a common step length. One of the federates updates an object attribute before each time advance request, while the other federate has subscribed to the object attribute and reflects the attribute update. For each execution of the step function, the wall clock time is measured.

3) Round Trip: If a federate currently holds the ownership of a specific object, the current value of its attribute is incremented by one. After the update is sent, the federate waits until a divest request arrives. Finally, the ownership is released. In the other case, the ownership for the object is requested. Then, the federate waits until the ownership is received. In both cases, the federates return after advancing their logical clocks. Thus, the wall clock time is measured for each round, either requesting ownership and reflecting, or updating and handing over the ownership.

4) Interaction: In contrast to experiment number 2, no object attribute update, but an interaction is sent by the federate with role zero. This is done with varying payload sizes from 64 B to 1 MB. Both of the federates request a time advance. The other federate has subscribed to the interaction and receives it before the following time advance grant is given. The federates are run on two dedicated machines that are connected via a Gigabit Ethernet switch.

IV. EXPERIMENTAL RESULTS

After the design of the study and the four experiments were described in the last section, the results are given below.

A. Experiment 1: Time Synchronization

For this and the following experiments, the lower bound for the evokeMultipleCallbacks() call is set to 1 ms. Therefore, this should be the lowest value to be observed. The outcomes of the $10^6$ measurements are presented in Figure 3 on a logarithmic scale. (The characters C and J denote the binding used, while E and L denote the mode of the run.) Aggregated numbers for this and the following two experiments are given in Table II. For CERTI, MAK, and Pitch, the time advance cycles are of comparable duration. On average, a cycle took around 1.39 ms, 2.29 ms, and 1.05 ms, respectively, for a local run. The remote runs differ only slightly.

[Figure: "Time Advance"; x-axis: Cycle Duration (Milliseconds), logarithmic with ticks at 10^0, 10^1, 10^2; one row per run: CERTI CE, CERTI CL, MAK CE, MAK CL, MAK JE, MAK JL, Pitch CE, Pitch CL, Pitch JE, Pitch JL, Portico CE, Portico CL, Portico JE, Portico JL; legend: C: C++, J: Java, E: Ethernet, L: Local]
Fig. 3. Time synchronization (experiment 1).

[Figure: "Object Attribute Update"; same axis, rows, and legend as Fig. 3]
Fig. 4. Attribute update (experiment 2).

[Figure: "Roundtrip"; same axis, rows, and legend as Fig. 3]
Fig. 5. Round trip (experiment 3).
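
The ownership handover of the round trip experiment can be illustrated with a toy, single-process model. This sketch involves no RTI and no time management; the function name, the initial value 0, and the choice of role 0 as the first owner are assumptions made only for illustration. It shows the alternation described for experiment 3: in every round the current owner increments the attribute and then hands the ownership to the other federate.

```python
def simulate_round_trip(rounds):
    """Toy model of the round trip: returns (owner_role, value) per round."""
    value = 0   # attribute value before the first update (assumed)
    owner = 0   # role that owns the attribute initially (assumed)
    history = []
    for _ in range(rounds):
        value += 1                      # owner sends lastReceivedValue + 1
        history.append((owner, value))
        owner = 1 - owner               # owner divests, the other role acquires
    return history


print(simulate_round_trip(4))
```

Each federate therefore updates the attribute in every second round and reflects it in the rounds in between.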


TABLE II
DURATION OF DIFFERENT EXPERIMENT CYCLES (MILLISECONDS)

            Time Synchronization                | Object Attribute Update             | Round Trip
Run           Mean Median    Std    Min    Max |   Mean Median    Std    Min    Max |   Mean Median    Std    Min    Max
CERTI CE       3.0    1.4    9.3    1.2  368.6 |    2.8    1.7    8.0    1.3  843.1 |   10.2    4.6   24.3    1.6  870.2
CERTI CL       1.6    1.4    6.1    1.1  655.0 |    2.8    1.7   12.0    1.3 1123.0 |    5.4    3.1   14.8    1.5  682.3
MAK CE         2.6    2.4    7.9    1.1  916.0 |    2.6    2.4    6.4    1.1  590.4 |   11.2    6.0   29.7    3.2  971.8
MAK CL         2.5    2.3   13.5    1.1 1228.8 |    2.9    2.3   18.3    1.1 1258.2 |    6.2    4.4   25.6    2.2 1125.5
MAK JE         2.5    2.4    5.5    1.1  741.9 |    2.4    2.4    4.0    1.1  675.8 |    8.7    6.0   18.0    3.2  768.6
MAK JL         2.3    2.3    9.8    1.1  962.2 |    2.5    2.3   10.3    1.1 1203.0 |    5.7    4.6   18.3    2.1 1004.0
Pitch CE       1.8    2.0    1.6    1.0  344.3 |    1.8    1.1    1.6    1.0  282.8 |    3.5    3.1    1.9    1.1  210.7
Pitch CL       1.6    1.1    1.2    1.0  201.8 |    1.6    1.1    0.7    1.0   33.0 |    3.1    3.1    2.2    1.1  603.9
Pitch JE       1.8    2.0    1.6    1.0  365.0 |    1.7    1.1    1.7    1.0  268.9 |    3.3    3.1    1.4    1.1   51.0
Pitch JL       1.7    2.0    0.8    1.0   77.6 |    1.6    2.0    0.7    1.0   30.7 |    3.1    3.1    1.0    1.1   31.8
Portico CE    35.1   30.0   46.1    0.1 1261.4 |   35.5   30.0   44.2    0.1 1219.9 |  114.8  121.0   86.0   28.1 1508.0
Portico CL    42.9   30.0   65.3    0.1 1441.0 |   41.8   30.0   61.4    0.1 1498.3 |  155.8  122.0  153.2   27.9 2331.3
Portico JE    36.6   30.0   50.2   0.05 1227.1 |   35.4   30.0   43.0   0.04 1003.3 |  114.2  120.8   82.5   22.9 1657.2
Portico JL    41.4   30.0   62.5   0.04 2042.4 |   44.9   30.0   77.6    0.1   1725 |  147.1  121.0  143.9   29.2 2128.0
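
The five statistics reported per run in Table II can be computed from a list of per-cycle durations as sketched below. The sample data is invented for illustration, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, median, pstdev


def summarize(durations_ms):
    """Aggregate per-cycle durations (ms) into the statistics used in Table II."""
    return {
        "mean": mean(durations_ms),
        "median": median(durations_ms),
        "std": pstdev(durations_ms),   # population standard deviation (assumed)
        "min": min(durations_ms),
        "max": max(durations_ms),
    }


# Invented sample: four fast cycles and a single slow outlier.
summary = summarize([1.0, 1.0, 1.0, 1.0, 31.0])
print(summary)
```

In this sample, the single outlier lifts the mean to 7.0 ms while the median stays at 1.0 ms. The same effect is visible in Table II wherever a large maximum coincides with a mean above the median.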

The results for Portico are one order of magnitude higher. A time advance cycle took 30 ms on average. The difference to the other three RTIs can probably be explained by the bundling option of jgroups, which is enabled by default with a maximum waiting time of 30 ms. The decentralized approach of Portico 2.1.0 could be another reason for the slower speeds. The different cycle durations are additionally plotted in Figure 6. One can see that the Portico cycles mostly take around 2 ms, around 30 ms, or around 60 ms. Particularly interesting is a look into the time series. Mostly, there are oscillating patterns between 1 ms and 60 ms (Figure 6), which may again be explained by jgroups and the decentralized approach.

Another interesting point is the lower bound in the Portico cases. There are values lying below 1 ms with a minimum value of 0.04 ms. Therefore, the evokeMultipleCallbacks() minimum waiting time value seems to be ignored. This was also a problem for CERTI 1516-2010, but a patch that fixes this issue is available [31].

When looking at the difference between the local and the physically distributed runs, not much discrepancy and thus overhead can be observed.

[Figure: "Time Advance" histogram; x-axis: Duration (Milliseconds), 0 to 70; y-axis: Number of Measures; inset time series: Duration (Milliseconds) over Number of Cycle; series: CERTI CE, MAK CE, Pitch CE, Portico CE]
Fig. 6. Histogram of cycle durations.

B. Experiment 2

The next experiment builds on top of the first experiment. Prior to the time advance, an object attribute (64 bit integer) is updated by one of the federates and reflected by the other in each round. Hence, higher cycle durations, compared to the first experiment, should be observed. This is the case for CERTI CL, but mostly the values remain stable. Even some smaller numbers can be seen (e.g., 42.9 ms in mean for Portico CL in experiment 1 and 41.8 ms in this experiment). In general, the medians are equal or higher (compared to experiment 1), except for the values of Pitch. While the mean values do not differ here, the medians decrease for the Ethernet runs.

C. Experiment 3

Naturally, the time effort increases if we introduce additional ownership management related tasks. The object’s attribute value is alternately reflected and updated by both federates. The measures show the time for one turn. For the median, the observed durations are in a range between 1.41 (Pitch JL) and 4.07 (Portico CL) times higher than in experiment 2. Overall, Pitch performs best. It is followed by CERTI and MAK. Even if more messages need to be exchanged in this experiment, the observed maximum values are not necessarily higher than in the previous experiment. However, this is the case for all Portico runs. An overall maximum duration of 2.3 s was observed for the Portico CL run.

D. Experiment 4

A typical payload size depends on, among other things, the application domain, the simulation models, the input parameters, and the distribution topology. The used sizes should cover most of them. Based on the previous experiments, the fastest bindings were used for Ethernet runs in this experiment: CERTI CE, MAK CE, Pitch JE, and Portico JE. The impact of an interaction’s payload size is surprisingly small as depicted in Figure 7 and Figure 8. From 64 B to 16 kB, there is not much variation. Pitch is faster than MAK and CERTI. The results for Portico show the already known multi-modal distribution with a big scattering (see Figure 9). From 32 kB on, differences can be observed that are given in Table III. First, Portico’s oscillating pattern vanishes, while the average duration remains on an equal level. This is probably because Portico’s jgroups


maximum bundling size defaults to 64 kB. Second, 32 kB is the last payload size that is delivered by MAK (at least with the described parameter sets). Third, Pitch gets slightly slower than CERTI with increasing payload (>64 kB). Considering the mean values for the bigger interaction sizes, the durations are acceptable, as they are not magnitudes higher than the pure transmission delay. For instance, for 1 MB on the Gigabit Ethernet link, the transmission delay would be $1048576 \cdot 8\,\mathrm{bit} \cdot 10^{-9}\,\mathrm{s/bit} \approx 8.4\,\mathrm{ms}$, while the means range between 22.69 ms and 42.24 ms.

TABLE III
MEAN VALUES FOR INTERACTION DURATIONS (MILLISECONDS)

           16 kB   32 kB   64 kB   128 kB   256 kB   512 kB    1 MB
CERTI       2.61    2.93    3.24     3.38     5.57    10.28   22.69
MAK         2.45    2.46       -        -        -        -       -
Pitch       1.73    1.69    2.94     4.77    10.14    20.12   42.24
Portico    34.78   31.52   33.10    34.37    36.14    35.12   37.39
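
The transmission delay estimate for the 1 MB payload can be reproduced with a few lines of Python. The function name and its default link rate of $10^9$ bit/s (Gigabit Ethernet, ignoring protocol overhead) are assumptions for this sketch.

```python
def transmission_delay_ms(payload_bytes, link_bits_per_second=1e9):
    """Pure serialization delay in milliseconds, ignoring protocol overhead."""
    return payload_bytes * 8 / link_bits_per_second * 1000.0


delay_1mb = transmission_delay_ms(1024 * 1024)
print(f"{delay_1mb:.2f} ms")
```

Relative to this lower bound of roughly 8.4 ms, the measured means for 1 MB in Table III (22.69 ms for CERTI and 42.24 ms for Pitch) are about 2.7 and 5.0 times higher.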
[Figure: "Interactions"; x-axis: Payload Size (Bytes), 64 to 1M; y-axis: Duration (Milliseconds), logarithmic with ticks at 10^0, 10^1, 10^2; series: CERTI, MAK, Pitch, Portico]
Fig. 7. Interactions (experiment 4).

[Figure: "Interactions"; x-axis: Payload Size (Bytes), 64 to 1M; y-axis: Duration (Milliseconds); series: CERTI, MAK, Pitch, Portico]
Fig. 8. Mean duration of interactions (experiment 4).

[Figure: "Portico: Interactions"; x-axis: Duration (Milliseconds), 0 to 70; y-axis: Number of Measures; series: 8 kB, 16 kB, 32 kB, and 64 kB Payload]
Fig. 9. Portico: multi modal distribution (experiment 4).

V. CONCLUSION

In this paper, we presented a performance comparison of the four most well-known HLA RTI implementations. The findings should provide assistance in deciding which RTI is suitable for different purposes. To this end, we provided general information about HLA, designed four typical test cases, and discussed the test results. All implementations have their advantages and drawbacks.

For different reasons, it is not possible to draw a general conclusion about the single best RTI implementation from our presented experiments. First, the four test cases cover neither all aspects of HLA nor all of the different RTIs’ features. We chose a subset of functions that tended to be comparable. Consequently, the RTIs’ default parameters were used. Second, there is more to consider than just the performance when choosing an RTI. License modalities, support, used techniques, or additional features also play an important role and are highly dependent on the use case.

From a pure performance related view, the best RTI may be Pitch. In contrast, even if Portico’s performance may be the worst in our experiments, it is completely decentralized, which might meet certain other requirements. While Pitch pRTI was in general the fastest RTI, in experiment 4 CERTI became slightly faster with an increasing payload size (>64 kB). A complete comparison is not feasible, but with the used and published test suite, the reproduction of results and the extension with additional test cases is easily possible.

When contrasting our findings with the existing work on HLA performance [29], it is necessary to take into account the hardware evolution, the different versions of the standard, and the evolved RTI implementations. However, some differences are noteworthy: in our case, CERTI performs best for an interaction with 1 MB payload, while CERTI performs worst in their measurements for a round trip with that payload size. In all other cases, Portico is the slowest RTI, which is consistent with our numbers.

For future research, it would be interesting to involve more federates and observe the impact of multicast communication. To see the performance of region based DDM strategies in more complex test scenarios would be another relevant point, as well as using the HLA IMMEDIATE callback model. When version 2.2 of Portico is published, the performance impact of the central component is another open question.

Having base case benchmarks for HLA implementations, a comparison to other distributed simulation middlewares such as DDS (with similarly adapted test cases) sounds tempting.

REFERENCES

[1] L. Granowetter, “RTI interoperability issues–API standards, wire standards, and RTI bridges,” in Proceedings of the 2003 European Simulation Interoperability Workshop, no. 03S-SIW, 2003, p. 063.
[2] “IEEE standard for modeling and simulation (M&S) high level architecture (HLA)–framework and rules - redline,” IEEE Std 1516-2010 (Revision of IEEE Std 1516-2000) - Redline, pp. 1–38, 2010.
[3] B. Möller and L. Olsson, “Practical experiences from HLA 1.3 to HLA IEEE 1516 interoperability,” 04F-SIW-045, www.sisostds.org, 2004.
[4] B. Möller, P.-P. Sollin, M. Karlsson, and F. Antelius, “Early experiences from migrating to the HLA evolved C++ and Java APIs,” in Spring Simulation Interoperability Workshop, 2009.
[5] B. Möller, K. L. Morse, M. Lightner, R. Little, and R. Lutz, “HLA evolved–a summary of major technical improvements,” in Proceedings


of 2008 Spring Simulation Interoperability Workshop, 08F-SIW-064, 2008.
[6] CERTI, “CERTI project,” accessed 22.05.2020. [Online]. Available: https://savannah.nongnu.org/projects/certi/
[7] MÄK, “MÄK RTI,” accessed 22.05.2020. [Online]. Available: https://www.mak.com/products/link/mak-rti
[8] Pitch technologies, “Pitch pRTI,” accessed 22.05.2020. [Online]. Available: http://pitchtechnologies.com/products/prti/
[9] Portico, “poRTIco project,” accessed 22.05.2020. [Online]. Available: http://www.porticoproject.org
[10] E. Noulard, J.-Y. Rousselot, and P. Siron, “Certi, an open source rti, why and how,” 2009.
[11] MÄK, “Lightweight mode,” accessed 22.05.2020. [Online]. Available: https://www.mak.com/products/link/mak-rti#lightweight-mode-supports-rapid-development-and-real-time-federations
[12] Pitch, “Pitch pRTI USER’S GUIDEv 5.4,” 2019.
[13] T. Roth, M. Burns, and T. Pokorny, “Extending portico HLA to federations of federations with transport layer security,” 2018.
[14] Portico, “Architectural overview of portico,” accessed 22.05.2020. [Online]. Available: http://portico.openlvc.org/index.php?title=Portico Architectural Overview
[15] P. Ryan, P. Ross, and W. Oliver, “Distributed interactive simulation revisited: Capabilities of the revised IEEE standard,” 1994.
[16] R. M. Fujimoto and R. M. Weatherly, “HLA time management and dis,” in Proceedings of 14th Workshop on Distributed Interactive Simulation, 1996.
[17] Modelica, “Fmi 2.0.1 specification,” accessed 22.05.2020. [Online]. Available: https://github.com/modelica/fmi-standard/releases/download/v2.0.1/FMI-Specification-2.0.1.pdf
[18] G. Pardo-Castellote, “OMG data-distribution service: architectural overview,” in 23rd International Conference on Distributed Computing Systems Workshops, 2003. Proceedings. IEEE, 2003, pp. 200–206.
[19] T. Nouidui, M. Wetter, and W. Zuo, “Functional mock-up unit for co-simulation import in energyplus,” Journal of Building Performance Simulation, vol. 7, no. 3, pp. 192–202, 2014.
[20] M. U. Awais, P. Palensky, A. Elsheikh, E. Widl, and S. Matthias, “The high level architecture RTI as a master to the functional mock-up interface components,” in 2013 International Conference on Computing, Networking and Communications (ICNC), 2013, pp. 315–320.
[21] M. U. Awais, M. Cvetkovic, and P. Palensky, “Hybrid simulation using implicit solver coupling with HLA and fmi,” International Journal of Modeling, Simulation, and Scientific Computing, vol. 8, no. 04, p. 1750055, 2017.
[22] Y. Bouanan, S. Gorecki, J. Ribault, G. Zacharewicz, and N. Perry, “Including in HLA federation functional mockup units for supporting interoperability and reusability in distributed simulation,” in Proceedings of the 50th Computer Simulation Conference, ser. SummerSim 18. San Diego, CA, USA: Society for Computer Simulation International, 2018.
[23] N. Sievert, “Modelica models in a distributed environment using fmi and hla,” 2016, Thesis. [Online]. Available: http://www.diva-portal.org/smash/record.jsf?pid=diva2%3A971217&dswid=-6326
[24] L. I. Hatledal, H. Zhang, A. Styve, and G. Hovland, “FMU-proxy: A Framework for Distributed Access to Functional Mock-up Units,” Feb 2019, pp. 79–86.
[25] M. Krammer, M. Benedikt, T. Blochwitz, K. Alekeish, N. Amringer, C. Kater, S. Materne, R. Ruvalcaba, K. Schuch, J. Zehetner, M. Damm-Norwig, V. Schreiber, N. Nagarajan, I. Corral, T. Sparber, S. Klein, and J. Andert, “The Distributed Co-Simulation Protocol for the Integration of Real-Time Systems and Simulation Environments,” in Proceedings of the 50th Computer Simulation Conference, 2018. [Online]. Available: https://dl.acm.org/citation.cfm?id=3275383
[26] Y. Park and D. Min, “Development of HLA-DDS wrapper API for network-controllable distributed simulation,” in 2013 7th International Conference on Application of Information and Communication Technologies. IEEE, 2013.
[27] W. Baron, C. Sippl, K.-S. Hielscher, and R. German, “Repeatable Simulation for Highly Automated Driving Development and Testing,” in 2020 IEEE 91st Vehicular Technology Conference. IEEE, 2020.
[28] M. Gütlein and A. Djanatliev, “Modeling and simulation as a service using Apache Kafka,” in Proceedings of the 10th International Conference on Simulation and Modeling Methodologies, Technologies and Applications, ser. SIMULTECH 2020, 2020.
[29] L. Malinga and W. H. Le Roux, “HLA RTI performance evaluation,” 2009.
[30] M. Gütlein, “Implementation of experiments,” accessed 27.05.2020. [Online]. Available: https://github.com/cs7org/HLAPerformance
[31] CERTI, “1516e timing patch,” accessed 22.05.2020. [Online]. Available: https://savannah.nongnu.org/bugs/?56284


Reduction of Inter-process Communication in


Distributed Simulation of Road Traffic
Tomas Potuzak
Department of Computer Science and Engineering, Faculty of Applied Sciences
University of West Bohemia
Plzen, Czech Republic
tpotuzak@kiv.zcu.cz

Abstract—A detailed computer simulation is an important The communication protocols in many existing distributed
tool for the managing of road traffic. Since it can be very time road traffic simulator successfully ensure both the synchroniza-
consuming, it is often performed in a distributed computing tion and the transfer of vehicles. Nevertheless, only in some
environment. The simulated road traffic network is then divided cases, an effort was invested into the minimization of the inter-
into sub-networks simulated by processes on the nodes of the process communication, although it is often the main bottle-
distributed computer. The inter-process communication necessa- neck of distributed applications. Hence, we explored possible
ry for the vehicle transfer and the synchronization can then ways for the reduction of inter-process communication. The
significantly influence the performance of the distributed road results of this research are two efficient communication proto-
traffic simulation. In this paper, two efficient communication
cols – the Long Step (LS) and the Long Step Binary (LSB)
protocols for distributed road traffic simulation, which we
protocols, whose basic functioning is described in [6] and [7]
developed during our previous research, are compared. These
protocols – the Long Step (LS) protocol and the Long Step in detail. There are three variants of both protocols. The com-
Binary (LSB) protocol – reduce the inter-process communication parison of these variants using a thorough testing is the main
using an aggregate message transfer and/or a lossy data contribution of this paper. The tests were performed using the
compression. Semi-centralized, centralized, and distributed Distributed Urban Traffic Simulator (DUTS) developed at De-
variants of both protocols were thoroughly tested and compared partment of Computer Science and Engineering of University
to a reference communication protocol representing a common of West Bohemia (DCSE UWB). However, their principles are
protocol of distributed road traffic simulators. The tests indicate utilizable for other distributed road traffic simulators as well.
significant savings of the number of transferred messages and,
more importantly, of the total computation time. II. DISTRIBUTED ROAD TRAFFIC SIMULATION
In order to make the further reading clearer, the basic
Keywords—road traffic, distributed simulation, communication notions of road traffic simulation and its distributed version are
reduction, aggregate transfer, lossy data compression briefly explained in following subsections.
I. INTRODUCTION A. Important Features of Road Traffic Simulation
The road traffic density on highways and especially in The time-flow mechanism determines the way the simula-
cities is steadily increasing. A computer simulation is one of tion time is advanced. Two mechanisms are commonly used –
the important tools for the managing of road traffic. It can be the time-stepped one and the event-driven one. Using the
used for analysis of existing road traffic networks and former mechanism, the entire simulation state is periodically
improvement of their performance, for prediction of recomputed [8]. The period (called time step) is usually one
consequences of a road closure, and so on. In order to model second (e.g., in [1], [5], [9]), but is not the only possible value
the real road traffic situations accurately, the simulation must (e.g., in [10], 0.1 seconds is used). Using the latter mechanism,
be very detailed. Very often, multiple simulation runs of a the simulation state changes by interpreting events. An event
single scenario are required in order to guarantee the fidelity of incorporates an action (i.e., an incremental change of the
the results. Because of these two requirements, the road traffic simulation state) and a time stamp indicating when the action
simulation is very time-consuming, especially for large road should be performed. So, the simulation time advances from
traffic networks (e.g., large cities or entire states). Hence, some one time stamp to another [8].
existing road traffic simulators (e.g., [1], [2], [3], [4], [5]) were
adapted for a distributed computing environment where the The level of detail of the road traffic simulation determines
combined computing power of multiple interconnected (single- its fidelity on one side and its speed on another. The
core or multi-core) computers is used for a faster execution of macroscopic simulation deals only with aggregated traffic
the simulation. A common approach is that the road traffic flows in individual roads. These models are very fast and also
network is divided into sub-networks, whose simulations are very old [11]. Both time flow mechanisms are commonly used.
then performed as processes on individual computers (called The mesoscopic simulation adds some form of individual
nodes) of the distributed computer. Communication links are vehicles, but modeling of their mutual interactions is limited
maintained among the processes to ensure their mutual [12], which makes the mesoscopic simulations also very fast.
synchronization and the transfer of vehicles between the Again, both time flow mechanisms are commonly used. In the
neighboring sub-networks. microscopic simulation, every single vehicle is modeled as an
object with its own position, direction, speed, and acceleration.
This work was supported by Institutional support for long-term strategic
development of research organizations.
978-1-7281-7343-6/20/$31.00 ©2020 IEEE
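The two time-flow mechanisms described in Section II.A can be contrasted with a minimal Python sketch. It is only an illustration under simplifying assumptions and is not taken from any of the cited simulators; the function names `run_time_stepped` and `run_event_driven`, the `(time_stamp, seq, action)` event layout, and the generic `state` object are all hypothetical.

```python
import heapq


def run_time_stepped(state, step_fn, n_steps):
    """Time-stepped mechanism: recompute the entire state once per time step."""
    for step in range(n_steps):
        state = step_fn(state, step)
    return state


def run_event_driven(state, events):
    """Event-driven mechanism: jump from one time stamp to the next.

    `events` is an iterable of (time_stamp, seq, action) tuples; `seq` is a
    unique tie-breaker so that two events with equal time stamps never force
    a comparison of the action callables.
    """
    queue = list(events)
    heapq.heapify(queue)
    clock = 0.0
    while queue:
        clock, _seq, action = heapq.heappop(queue)  # advance the simulation time
        state = action(state)                       # apply the incremental change
    return state, clock
```

In the time-stepped loop the cost per step is independent of how much actually happens, whereas the event-driven loop only does work when an event is scheduled.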

The simulated vehicles drive on roads, change their directions and traffic lanes, and interact with each other. The time-stepped time-flow mechanism is most often used. There are two widely used microscopic traffic models – the cellular automaton model, which divides the traffic lanes into equally-sized traffic cells [13], and the car-following model, which allows a vehicle to be placed at any position in the traffic lane [14]. In both models, the vehicles tend to accelerate to the maximal allowed speed and decelerate when there is an obstacle in their way (e.g., a slower vehicle). Both models also often incorporate random deceleration in order to emulate natural fluctuations of the speed [13], [14]. Since the microscopic simulations model the real traffic at a high level of detail, they are also much more computationally demanding, which is the reason why they are often performed in a distributed computing environment.

B. Important Features of Distributed Road Traffic Simulation

The primary reason for adapting a road traffic simulation for a distributed computing environment is its speedup. As stated in Section I, a distributed computer consists of multiple computers (nodes) interconnected by a computer network (usually Ethernet). An example is a set of ordinary workstations in a university classroom. There is no shared memory among the nodes, so the only means of communication is message passing [8]. It should be noted that the nodes of the distributed computer can and often do incorporate multi-core processors. This distributed/parallel computing environment can be used for an additional speedup of the distributed road traffic simulation, for example by employing multi-threaded simulation processes (see [5] for details). However, this does not influence the message passing between the processes, so we will not go into further detail in this paper.

The distributed road traffic simulation consists of (possibly multi-threaded) processes running on the particular nodes of the distributed computer. So, there are two main issues, which must be solved prior to its execution – how to divide the simulation into processes (i.e., the decomposition) and how these processes will communicate (i.e., the inter-process communication). In the field of road traffic simulation, the spatial decomposition is most common. It is used for example in [1], [3], [4], [5], [15]. Most often, the simulated road traffic network is divided into sub-networks, which are then simulated by particular working processes on the nodes of the distributed computer. The inter-process communication is then necessary for the transfer of vehicles traveling from one sub-network to a neighboring one. A special case of spatial decomposition is the uniform division of vehicles among processes, rather than the division of the road traffic network into sub-networks (see [9] or [16]). There are also some examples of utilization of the temporal decomposition, which divides the road traffic simulation run into time intervals [17], and the task parallelization, which divides the simulation program into modules [18]. However, these two decompositions are quite rare and we will consider only the road traffic network division further in the text.

The inter-process communication ensured by a communication protocol is necessary primarily for the transfer of vehicles and lane-blocks between the neighboring sub-networks in traffic lanes crossing the boundary between two sub-networks (so-called divided lanes). The lane-blocks indicate that the traffic lane is congested or has become passable again. These messages travel in the opposite direction to the vehicles and can vary significantly in their form (from a simple bit indication [19] to complex information about the traffic situation near the end of the traffic lane [1], [20]).

To ensure that the transferred vehicles and lane-blocks arrive at the neighboring sub-networks in the correct time step, all simulation processes must perform the same time step at the same moment. Usually, the processes are synchronized using a barrier. This barrier can be provided by a central (control) process [1], [5], [21], [22] or can be distributed among the working processes [15], [21], [23]. In both cases, the synchronization mechanism requires additional messages to be sent among the processes. In the case of a centralized barrier, the synchronization messages are transferred between the control process and the working processes. In the case of a distributed barrier, the synchronization messages are transferred only among the working processes. The vehicles and lane-blocks are often transferred directly between the working processes simulating the neighboring sub-networks [1], [15], [23]. An alternative is to transfer them via the control process [2], [6], [22].

III. RELATED WORK

Since our communication protocols are focused on the inter-process communication reduction, related papers at least partially dealing with the minimization of the inter-process communication are mentioned in the following subsections.

A. Reduction of Sent Messages Count

One way to reduce the inter-process communication is to reduce the number of sent messages. This approach is used in [23]. There, a distributed synchronization is performed by exchanging synchronization messages only between the processes simulating neighboring sub-networks (“neighboring processes” for short). This “semi-optimistic” approach requires fewer messages than the “all-to-all” approach used for example in [15], where each process sends a synchronization message to all other processes. Also, the vehicles and lane-blocks are transferred via the synchronization messages [23].

A similar distributed synchronization between the neighboring processes only is described in [21] together with the classical master-slaves synchronization. In both synchronization types, the vehicles and lane-blocks are sent directly between the neighboring sub-networks [21].

The transfer of vehicles and lane-blocks via the synchronization messages is utilized also in [22]. However, the synchronization is a centralized one with a master process controlling the transfer of all data in the distributed simulation [22].

The message count is also reduced in [10], where the edges of neighboring sub-networks have overlapping regions acting as buffers for vehicles. The contents of the buffers are exchanged between the neighbors only once per 300 time steps (1 time step is 0.1 seconds) and the size of the buffers is set based on this time and the maximal speed of 12 m/s. The synchronization is performed during the exchange of buffers only [10]. This is possible, since when there is no transfer of vehicles, no simulation inconsistencies can arise and no synchronization is needed.
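For illustration, the figures quoted from [10] can be combined into a lower bound on the buffer length. This is only a sketch under the assumption that a buffer must at least cover the distance traveled at the maximal speed between two exchanges; the exact sizing rule of [10] is not reproduced here, and the symbols $n$ (time steps between exchanges), $\Delta t$ (time step length), $v_{max}$, and $d_{min}$ are introduced solely for this example.

```latex
d_{min} = v_{max} \cdot n \cdot \Delta t
        = 12\,\mathrm{m/s} \cdot 300 \cdot 0.1\,\mathrm{s}
        = 360\,\mathrm{m}
```

Under this assumption, a vehicle entering the overlapping region right after an exchange cannot leave it before the next exchange takes place.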


A similar but more elaborate approach is used in [20]. Each individual sub-network has (possibly multi-level) extended regions on its border dubbed “extended layers” acting as “images” of the border regions of the neighboring sub-networks. The “images” of vehicles moving in the neighboring sub-networks move in these extended layers. This makes it possible to reduce the frequency of the synchronizations and of the vehicle and lane-block transfers. However, since a part of the computation is replicated, a trade-off must be found between the additional computation burden and the reduced inter-process communication. Moreover, exactly the same behavior of the vehicle images can be difficult to achieve in stochastic simulations [20].

B. Uncommon Approaches

Although the majority of distributed road traffic simulators use road traffic network division into sub-networks and the time-stepped time-flow mechanism, there are also some exceptions. A discrete-event time-flow mechanism is used in [24]. There, the vehicles are allocated to logical processes in a way minimizing the dependencies between the vehicles and thus also minimizing the interactions of the logical processes (i.e., the inter-process communication) [24].

An alternative spatial decomposition is used in [9]. There, the vehicles are distributed evenly based on their routes – the nearby vehicles interacting with each other are grouped together. This is intended to minimize the inter-process communication [9].

The even distribution of vehicles is used also in [16] together with the classical approach. Both approaches are combined with a car-following traffic model and a fundamental-diagram-based traffic model. All four combinations are thoroughly tested and various dependencies of the efficiency are reported (e.g., dependency on the number of vehicles, on the number of distributed computer nodes, etc.) [16].

IV. DISTRIBUTED URBAN TRAFFIC SIMULATOR

The communication protocols, which are described further in the text (see Section V), were tested using the Distributed Urban Traffic Simulator (DUTS) developed at DCSE UWB. The DUTS system is a microscopic time-stepped simulator of urban road traffic for a distributed computing environment written in the Java programming language [5].

A. Traffic Models

The DUTS system incorporates three microscopic traffic models inspired by three existing road traffic simulators – the JUTS [25], the TRANSIMS [1], and the AIMSUN [26]. This makes it possible to demonstrate the universality of the developed communication protocols.

The JUTS-based traffic model is based on the JUTS system, which was developed at DCSE UWB as well. It utilizes a modified Nagel-Schreckenberg’s cellular automaton model with a cell size of 2.5 meters [13], [25]. Each traffic cell can be empty or occupied by a single vehicle. On the other hand, a vehicle can occupy multiple cells (one to six) depending on its length [25]. The length of the time step is one second. For the calculation of the new positions of the vehicles in each time step, the rules from the original Nagel-Schreckenberg’s model [13] were adopted [5].

The TRANSIMS-based traffic model utilizes the original Nagel-Schreckenberg’s cellular automaton model [13] and a one-second time step. The main differences from the JUTS-based model are the different traffic cell length (7.5 meters) and a single length of all vehicles corresponding to it [5].

The AIMSUN-based model is based on a modified Gipps’ car-following model [14]. The traffic lanes are not divided into cells and a vehicle can be placed at any position of the traffic lane in which it is moving. Similarly to the JUTS-based model, the various lengths of vehicles are considered. The length of the time step is again one second [5].

B. Road Traffic Network Structure

In all the traffic models, the road traffic network consists of traffic lanes, crossroads, curves, generators, and terminators. The curves represent turning points in otherwise straight traffic lanes and also their branching or merging (see Fig. 1). The generators generate the vehicles incoming to the simulated road traffic network at its boundaries using pseudorandom number generators with an exponential distribution. The terminators remove the vehicles leaving the simulated road traffic network [5]. An example of a small road traffic network with all its basic elements is depicted in Fig. 1.

Fig. 1. Basic elements of a small road traffic network

C. Road Traffic Network Division

The road traffic networks used for the testing were divided manually into the required number of sub-networks by marking the traffic lanes that were to be divided. These lanes were then automatically cut at their midpoints. One half of the traffic lane is equipped with a terminator and the other half with a generator, depending on the lane’s travel direction (see Fig. 2). The terminator and the generator are linked together and form a terminator-generator pair, which is used for the transfer of vehicles and lane-blocks between the neighboring sub-networks [5].

Fig. 2. A division of a road traffic network into two sub-networks

D. Distributed Execution

The DUTS system can be executed on a single computer, but it is primarily designed for the execution on a distributed computer. Each sub-network is simulated by one simulation process (possibly multi-threaded), usually residing on a single node of the distributed computer [5]. For the synchronization and the transfer of vehicles and lane-blocks, the DUTS system incorporates several communication protocols, including all the protocols described in this paper (see Section V).
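The lane division of Section IV.C can be illustrated by a minimal Python sketch. It is not the DUTS code (the DUTS system is written in Java [5]); the names `Lane`, `Generator`, `Terminator`, and `cut_lane`, as well as all their fields, are hypothetical and only show how a marked lane might be cut at its midpoint and how the two halves might be linked by a terminator-generator pair.

```python
from dataclasses import dataclass


@dataclass
class Lane:
    lane_id: str
    length_m: float
    subnetwork: int          # sub-network that simulates this lane (or half-lane)


@dataclass
class Generator:
    lane: Lane               # half-lane into which received vehicles are inserted


@dataclass
class Terminator:
    lane: Lane               # half-lane whose vehicles leave the sub-network
    peer: Generator          # linked generator in the neighboring sub-network


def cut_lane(lane_id, length_m, from_subnetwork, to_subnetwork):
    """Cut a marked lane at its midpoint and link the halves.

    Vehicles travel from `from_subnetwork` to `to_subnetwork`, so the upstream
    half ends with a terminator and the downstream half starts with a generator.
    """
    half = length_m / 2.0
    upstream = Lane(lane_id + "/up", half, from_subnetwork)
    downstream = Lane(lane_id + "/down", half, to_subnetwork)
    generator = Generator(downstream)
    terminator = Terminator(upstream, generator)
    return terminator, generator
```

Keeping the link as a reference from the terminator to its generator mirrors the direction of vehicle transfer; lane-blocks would travel along the same link in the opposite direction.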


V. COMMUNICATION PROTOCOLS

The LS and LSB communication protocols, which we developed, reduce the inter-process communication using two approaches – the reduction of the number of sent messages (see Section III.A) and the reduction of the amount of transferred data [7]. The basic functioning of the protocols was described in [6] and [7], respectively. Each protocol has three variants, which are described in Section V.B and Section V.C, respectively. The reference protocol representing commonly used protocols is described in Section V.A.

A. Reference Communication Protocol

The reference communication protocol – the Semi-centralized single vehicle (SC-SV) protocol – does not employ any inter-process communication reduction. The protocol is lossless and does not introduce any error into the simulation.

The “semi-centralized” attribute of the protocol means that a centralized barrier provided by a control process is utilized for the synchronization and the vehicles and lane-blocks are transferred directly between the neighboring working processes. This is a quite common approach utilized for example in [1], [21], [22]. There are two synchronization messages per working process per time step – the notification sent from a working process to the control process to indicate the finish of the current time step computation and the permission sent from the control process to all working processes once all notifications have been received, enabling them to continue with the next time step.

The “single vehicle” attribute of the protocol means that a single vehicle (or a lane-block) is transferred in a single message. This is used for example in [15]. So, the total number of transferred vehicle/lane-block messages corresponds to the number of transferred vehicles and lane-blocks and can vary in individual steps. The total number of messages sent per one time step by the SC-SV protocol can be expressed as:

$$M_{\text{SC-SV}} = 2P + \sum_{i=1}^{P} (V_i + L_i), \qquad (1)$$

where $P$ is the number of working processes, $V_i$ is the number of vehicles sent by the $i$th working process in the time step, and $L_i$ is the number of lane-blocks sent by the $i$th working process in the time step.

The implementation of the SC-SV protocol in the DUTS system works as follows (see Fig. 3). The vehicles and lane-blocks are transferred using the terminator-generator pairs (see Section IV.C). When a vehicle reaches the terminator, it is removed from the lane (a) and its parameters (e.g., speed, length, etc.) are packed into a message (b). This message is sent to the neighboring working process using the established communication link, where it is forwarded to the corresponding generator (c). The generator unpacks the message, creates a new vehicle with the received parameters, and inserts it into the traffic lane (d). When a generator cannot insert a new vehicle into its congested traffic lane (e), it creates a lane-block indicating congestion and packs it into a message (f). This message is sent to the neighboring working process and forwarded to the corresponding terminator (g). The terminator then becomes impassable, which means that the vehicles incoming to it are stopped (h). When the congestion in the generator’s traffic lane is over, the generator creates a lane-block indicating that the traffic lane is clear and sends it to the terminator, which then becomes passable again.

Fig. 3. The functioning of the SC-SV protocol

The described implementation of the SC-SV protocol is usable by all three traffic models. No modifications are needed.

B. Long Step Protocol

The Long Step (LS) protocol utilizes the aggregation of transferred vehicles and lane-blocks for a significant reduction of the number of transferred messages. The vehicles and lane-blocks are aggregated both spatially and temporally. The spatial aggregation means that the vehicles and lane-blocks from multiple traffic lanes are sent in one message. For example, the outgoing vehicles and lane-blocks from all lanes leading from a single sub-network to a single neighboring sub-network are transferred in one message instead of being transferred separately [6]. The specifics of the aggregation depend on the variant of the LS protocol (see below).

The temporal aggregation means that the vehicles and lane-blocks from multiple time steps are sent in one message regularly once per several time steps. The number of time steps between two successive transfers of vehicles and lane-blocks is designated as the long step. This is possible, because the movement of the vehicles in a single traffic lane is affected only by the vehicles themselves. Thus, the vehicles in a single lane, which shall be transferred to a neighboring sub-network throughout the long step, are stored in a buffer. After the long step period has elapsed, the entire content of the buffer is transferred at once to the neighboring sub-network. The lane-blocks are also sent once per long step and have a different form. Instead of simply indicating that a lane has become congested or passable, they carry information about the available space in the traffic lane. Since the synchronization of the working processes is necessary only because of the transfer of vehicles and lane-blocks, it is performed only once per long step together with the transfer of vehicles and lane-blocks itself [6].

As follows from the previous paragraphs, the protocol is lossless, since it transfers all information about all vehicles and lane-blocks. The combination of both the spatial and the temporal aggregation means that the number of messages sent per time step of the LS protocol is very low. Nevertheless, the exact number depends on the LS protocol variant – the semi-centralized (SC), centralized (C), and distributed (D) one (see Fig. 4).

The SC-LS variant utilizes a centralized barrier provided by the control process for the synchronization and the aggregated vehicles and lane-blocks are transferred directly between the neighboring working processes (see Fig. 4a). Similarly to the


SC-SV protocol, there are two synchronization messages per working process (a notification and a permission), but they are sent only once per long step. All the outgoing vehicles and lane-blocks from the lanes leading from a single sub-network to its single neighboring sub-network are transferred aggregated in one message. These aggregated vehicle/lane-block messages are transferred regularly once per long step. So, the total number of messages sent per one time step by the SC-LS variant can be expressed as:

$$M_{\text{SC-LS}} = \frac{2P + \sum_{i=1}^{P} N_i}{T_{LS}}, \qquad (2)$$

where $P$ is the number of working processes, $N_i$ is the number of neighboring working processes of the $i$th process, and $T_{LS}$ is the length of the long step.

The C-LS variant also utilizes a centralized barrier provided by the control process for the synchronization. Moreover, the aggregated vehicles and lane-blocks are transferred within the synchronization messages using the control process as a message router (see Fig. 4b) [6]. The synchronization is again performed once per long step. In each notification message (traveling from a working process to the control process), all outgoing vehicles and lane-blocks from the lanes leading from a single sub-network to all its neighboring sub-networks are aggregated. The control process receives the notification messages from all working processes, rearranges their contents into the permission messages, and sends the permission messages to the working processes. In each permission message, all incoming vehicles and lane-blocks from the lanes leading to a single sub-network from all its neighboring sub-networks are aggregated (see Fig. 4b). Thus, the total number of transferred messages per one time step sent by the C-LS variant can be expressed as:

$$M_{\text{C-LS}} = \frac{2P}{T_{LS}}, \qquad (3)$$

where $P$ is the number of working processes and $T_{LS}$ is the length of the long step.

In the D-LS variant, the control process exists only to start and end the simulation run (as such, it is not shown in Fig. 4c). All inter-process communication during the simulation run is performed directly between the neighboring working processes. Similarly to the C-LS variant, the synchronization messages and the vehicle/lane-block messages are merged together and are transferred only once per long step. Unlike both previous variants, only the notifications are used, no permissions. Each working process sends a notification to all its neighboring working processes. Each notification incorporates aggregated vehicles and lane-blocks from the lanes leading from a single sub-network to its single neighboring sub-network. Each working process waits for the notifications from all its neighboring working processes. Once they all arrive, the working process can continue with its computation of the next long step. The total number of transferred messages per one time step sent by the D-LS variant can be expressed as:

$$M_{\text{D-LS}} = \frac{\sum_{i=1}^{P} N_i}{T_{LS}}, \qquad (4)$$

where $P$ is the number of working processes, $N_i$ is the number of neighboring working processes of the $i$th process, and $T_{LS}$ is the length of the long step.

Fig. 4. The comparison of the LS protocol variants

Regardless of the utilized variant, the implementation of the LS protocol in the DUTS system works as follows (see Fig. 5). Similarly to the SC-SV protocol, the ends of the divided traffic lanes are equipped with the terminator-generator pairs. However, in addition, both the generator and the terminator of each pair incorporate a buffer of traffic cells. The length of the buffer corresponds to the maximal distance which the vehicles can travel during the long step period. In the example in Fig. 5, the JUTS-based model is used, the long step length is 2 steps, and the maximal vehicle speed is 6 cells per time step (cpts). Thus,


the buffer length is set to 12 cells. Throughout the long step period, the vehicles in the lanes 1 and 2 in the sub-network 1 are heading to the terminators at the end of the lanes. When a vehicle reaches the terminator, it can enter its buffer as if it were a continuation of the lane. The lane 1 in the sub-network 2 is empty and the lane 2 in the sub-network 2 is partially congested.

Fig. 5. The functioning of the LS protocol

At the end of each long step, the vehicles in the sub-network 1 are removed from the buffers of the terminators (a), their parameters are packed into a message and sent using a communication link (depending on the protocol variant) to the sub-network 2 (b). There, the message is unpacked and the vehicles are forwarded to the corresponding generators (c). Simultaneously, in the sub-network 2, the available empty spaces in the lanes 1 and 2 are checked and saved into the lane-blocks (d). The lane-blocks are packed into a message and sent using the communication link to the sub-network 1 (e). There, the message is unpacked and the lane-blocks are forwarded to the corresponding terminators (f) [6].

The lane 1 in the sub-network 2 was empty prior to the transfer of the messages. Thus, the vehicles received from the sub-network 1 could be inserted into it easily. However, the lane 2 in the sub-network 2 was partially congested with only 6 unoccupied traffic cells left (see Fig. 5). In this space, only half of the received vehicles could fit in. To handle the remaining half, the generator is equipped with a buffer of traffic cells similar to the terminator, in which the remaining half of the vehicles can be placed. The lane-blocks with available space in the lane (including the generator’s buffer) are sent from the generators to the corresponding terminators (see Fig. 5). The terminator is then able to block the same part of its buffer, which is occupied by the vehicles in the generator’s buffer. If the generator’s buffer is completely full, the entire terminator’s buffer is blocked and no vehicles can pass the boundary of the sub-networks. This approach guarantees that no overflowing vehicles will be transferred from the terminator to the generator [6].

Similarly to the SC-SV protocol, the LS protocol is applicable for all three traffic models of the DUTS system. There is only a minor difference for the AIMSUN-based model. Since this model does not utilize the traffic cells, the buffers of the terminator-generator pairs are not constructed from traffic cells. Instead, the buffers are plain traffic lanes, in which the vehicles can be placed at any position.

C. Long Step Binary Protocol

The Long Step Binary (LSB) protocol is based on the LS protocol. It has the same three variants and also many other features of the LSB protocol (including the utilization of the long step, terminator-generator pairs with buffers, form of lane-blocks, etc.) are the same as the features of the LS protocol described in Section V.B. Thus, the total number of messages transferred per one time step is the same for the corresponding variants of the LS and LSB protocols.

The only difference is the transfer of vehicles. Instead of sending individual vehicles and all their parameters, the positions and lengths of the particular vehicles are transferred. The information about the positions and lengths of the vehicles in a buffer of a terminator is coded into two integer values regardless of the number of vehicles within the buffer. Such encoding leads to a significant reduction of the amount of transferred data, although the number of messages remains the same as using the LS protocol. Because some information about the vehicles is lost during the transfer, the LSB protocol is a lossy one [7].

The implementation of the LSB protocol in the DUTS system works as follows. Again, the ends of the divided traffic lanes are equipped with the terminator-generator pairs incorporating buffers of traffic cells with the same purpose as when using the LS protocol. The situation is depicted in Fig. 6 where the JUTS-based model is considered, the long step length is 2 steps and the maximal vehicle speed is 6 cpts. Thus, the length of the buffer is set to 12 cells.

When the vehicles in the terminator’s buffer shall be transferred, the binary encoding is performed. Each buffer cell is represented by one bit in the first integer value. If the traffic cell is occupied by a vehicle, the bit is set to 1, otherwise to 0. However, from this description, it is not possible to determine either the number of vehicles or their lengths. Therefore, there is another integer value, whose bits represent the connections among the cells (i.e., which occupied cells belong to one vehicle – see Fig. 6) [7].

Fig. 6. The functioning of the LSB protocol

Together with the positions and connections integer values, the mean speed of the vehicles is transferred. The positions and connections integer values and the lane-blocks from multiple traffic lanes are aggregated the same way as using the LS protocol, depending on the utilized variant of the LSB protocol


Fig. 7. The rasterization of the positions of the vehicles


Fig. 8. The road traffic network used for the divided lanes count testing
(i.e., the SC-LSB, the C-LSB, or the D-LSB). When the two
integer values together with the mean speed of the vehicles are time and the number of transferred messages. For the computa-
received by the corresponding generator, it is able to tion time to be a representative metric, the simulation itself
reconstruct the content of the terminator’s buffer in its own must consume a similar amount of computation time for all tests.
buffer. After that, the vehicles are shifted forward to the The inter-process communication should also consume a major
succeeding lane by the length of the buffer, if possible (i.e. portion of the computation time. Then, the differences of the
there is no congestion blocking the lane). If not, all or some computation time caused by the communication protocols are
vehicles remain in the generator’s buffer. The lane-blocks work easily observable. Hence, a small road traffic network with 32
as in the LS protocol [7]. crossroads was used (see Fig. 8). The crossroads were arranged
The LSB protocol is applicable for all three traffic models into two rows and the road traffic network was divided betwe-
of the DUTS system. However, a slight modification is en these rows. The number of divided traffic lanes ranged
necessary for the AIMSUN-based model. Using the AIMSUN- between 2 and 32. The lanes, which should interconnect the
based model, which is based on a car-following model, there sub-networks, but do not, because there are less than 32
are no traffic cells in the lane and the vehicles can be located at connecting lanes, are not turned into dead-ends. Instead, the
any position in the lane. Nevertheless, the lane can be neighboring lanes with the opposite travel directions are conne-
rasterized into cells and the LSB protocol can be used only cted together near the boundary of the sub-networks using the
with a small deviation of the vehicle positions (see Fig. 7). curve element. The vehicle density in the divided lanes was
maintained at roughly 0.25 vehicles per time step (vpts).
It should be noted that the positions and lengths of the
vehicles are transferred errorless for the cellular automaton The tests were performed for all variants of the LS and
models. However, for the car-following models, the positions LSB protocols and for the reference SC-SV protocol. Each
of the vehicles can be transferred with a small deviation simulation run was 1000 time steps long (with 1000 time steps
(because of the rasterization). Hence, the protocol is more performed prior the measurement – a warm-up period) and
precise for the cellular automaton models (TRANSIMS- and each result was averaged from ten simulation runs. The number
JUTS-based in the DUTS system) than for the car-following of divided traffic lanes ranged from 2 to 32.
models (AIMSUN-based in the DUTS system) as confirmed by The dependency of the computation time of the particular
tests in [7]. The deviations of the real speeds of the vehicles communication protocols on the number of divided traffic
from the transferred mean speed are similar for all traffic lanes is depicted in Fig. 9. As can be seen, the computation
models. Nevertheless, the total error introduced into the time increases with the increasing number of divided traffic
simulation is very low [7]. lanes. The reason is that, for more traffic lanes, more vehicles
VI. TESTS AND RESULTS and lane-blocks are transferred (considering a constant vehicle
density in the lanes). All the variants of the LS and LSB
All variants of the developed communication protocols protocols give significantly better results (i.e., lower computa-
were thoroughly tested along with the reference communicati- tion time) than the reference SC-SV protocol.
on protocol in order to enable their comparison. The dependen-
cies of the protocols’ performance on the number of divided The numbers of messages transferred by the particular
traffic lanes, the vehicle density in these lanes, the size of the communication protocols are summarized in Table I. All
road traffic network, and the number of simulation working variants of both the LS and LSB protocols show significantly
processes was investigated, each parameter in a separate set of lower number of transferred messages than the reference SC-
tests. The results for the JUTS model only are presented, since SV protocol. The highest reduction of the number of transfer-
the utilized traffic model has little effect on the communication
protocol and the results are very similar.
All the tests were performed on a cluster called Hydra,
which was available at the DCSE UWB. The cluster consisted
of ten nodes interconnected with 1 Gbit Ethernet. Each node
incorporated Intel Xeon CPU at 3.2 GHz with 2 GB of RAM.
Each simulation process (one control process and from 2 to 8
working processes) resided on a node of the cluster. All
processes were single-threaded.
A. Dependency on the Number of Divided Traffic Lanes
For the dependency of the protocol’s performance on the
number of divided traffic lanes connecting the neighboring
sub-networks, the observed parameters were the computation
Fig. 9. Computation time dependent on the divided lanes count

129
2020 IEEE/ACM 24 ͭ ͪ International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

TABLE I. MESSAGES COUNT DEPENDENT ON DIVIDED LANES COUNT TABLE II. MESSAGES COUNT DEPENDENT ON THE VEHICLE DENSITY
Lanes count 2 4 8 16 32 Vehicle density 0.05 0.10 0.15 0.20 0.25
Protocol Number of transferred messages Protocol Number of transferred messages
SC-SV 4353 4699 5083 6568 7954 SC-SV 4283 4440 4527 4653 4738
SC-LS 750 750 750 750 750 SC-LS 750 750 750 750 750
C-LS 500 500 500 500 500 C-LS 500 500 500 500 500
D-LS 250 250 250 250 250 D-LS 250 250 250 250 250
SC-LSB 750 750 750 750 750 SC-LSB 750 750 750 750 750
C-LSB 500 500 500 500 500 C-LSB 500 500 500 500 500
D-LSB 250 250 250 250 250 D-LSB 250 250 250 250 250

The numbers of messages transferred by the particular


red messages is achieved by the D-LS and D-LSB variants (up communication protocols are summarized in Table II. Unlike
to 97 %). the reference SC-SV protocol, which exhibits the increase of
B. Dependency on Vehicle Density the number of messages with the increasing vehicle density, all
variants of the LS and LSB protocol show a constant number
For the dependency on the vehicle density, the observed of messages. However, for the higher vehicle densities, the
parameters were again the computation time and the number of messages are longer, because they incorporate more vehicles.
messages. For the tests, a very small road traffic network with The highest reduction of the number of transferred messages is
only four crossroads was used (see Fig. 10). The limited size of again achieved by the D-LS and D-LSB variant (up to 95 %).
the road traffic network minimizes the influence of the increa-
sing vehicles count moving within the network on the compu- C. Dependency on the Road Traffic Network Size
tation time. This increase is necessary in order to achieve the For the dependency on the road traffic network size, the
increase of the vehicle density in the divided traffic lanes. observed parameters were the speedup and the number of
transferred messages. The speedup is used instead of the
computation time, because, for larger road traffic networks, the
time necessary for the inter-process communication is only a
small portion of the computation time. The speedup is
calculated as the ratio of the sequential computation time and
Fig. 10. The road traffic network used for the vehicle density testing the distributed computation time. For the tests, three road
traffic networks of various sizes were used. The networks were
The tests were performed the same way as the tests for the regular grids of 16, 64, and 256 crossroads (see Fig. 12). The
dependency on the number of divided lanes. Only the vehicle total length of the traffic lanes was 24000 meters, 86400
density in the divided traffic lanes was being changed instead meters, and 326400 meters, respectively.
of the number of traffic lanes. The vehicle density in these The tests were performed the same way as the tests for the
lanes ranged from 0.05 to 0.25 vehicles per time step (vpts). previous two dependencies. Moreover, ten sequential simulati-
The dependency of the computation time of the particular on runs were performed prior the distributed simulation. The
communication protocols on the vehicle density in divided traf- average computation time of this sequential runs were used for
fic lanes is depicted in Fig. 11. As can be seen, the computation the calculation of the speedup.
time increases with the increasing vehicle density in the divi- The dependency of the speedup of the simulation on the
ded traffic lanes. The reason for this trend is again the increa- road traffic network size is depicted in Fig. 13. As can be seen,
sing number of transferred vehicles and lane-blocks. Again, it the speedup increases with the increasing size of the road
is obvious that all variants of the LS and LSB protocols give traffic network, because a larger portion of the computation
significantly better results (i.e., lower computation time) than time is consumed by the simulation itself (mostly by the
the reference SC-SV protocol. movement of vehicles) and a smaller portion is consumed by
the inter-process communication. Again, all variants of the LS
and LSB protocols give significantly better results than the
reference SC-SV protocol. Only this time, the higher speedup
is better, which is inverse in comparison to the computation
time used in both previous sets of tests.

Fig. 11. Computation time dependent on the vehicle density


Fig. 12. The road traffic networks used for the network size testing
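The two metrics used in these tests can be made concrete with a short sketch. The function names are illustrative and the timings passed to `speedup` are hypothetical; only the message counts are taken from Tables I and II (SC-SV versus D-LS at 32 divided lanes and at a density of 0.25 vpts).

```python
def reduction_percent(reference_count: float, protocol_count: float) -> float:
    """Relative reduction of transferred messages against the reference protocol, in percent."""
    return 100.0 * (1.0 - protocol_count / reference_count)


def speedup(sequential_time: float, distributed_time: float) -> float:
    """Speedup as defined in the text: sequential time divided by distributed time."""
    return sequential_time / distributed_time


if __name__ == "__main__":
    # Table I, 32 divided lanes: SC-SV sent 7954 messages, D-LS sent 250.
    print(round(reduction_percent(7954, 250)))  # 97
    # Table II, density 0.25 vpts: SC-SV sent 4738 messages, D-LS sent 250.
    print(round(reduction_percent(4738, 250)))  # 95
    # Hypothetical timings, for illustration only.
    print(speedup(120.0, 40.0))  # 3.0
```

The rounded values agree with the "up to 97 %" and "up to 95 %" reductions quoted for the D-LS and D-LSB variants.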

Fig. 13. Speedup dependent on the road traffic network size

The numbers of messages transferred by the particular communication protocols are summarized in Table III. Again, all variants of the LS and LSB protocols show a constant number of messages. These messages are significantly longer for the larger road traffic networks, because there is a significant increase of the number of divided traffic lanes. This is also the reason why the number of messages sent by the reference SC-SV protocol increases. The highest reduction of the number of transferred messages is again achieved by the D-LS and D-LSB variants (up to 96 %).

TABLE III. MESSAGES COUNT DEPENDENT ON THE NETWORK SIZE

Crossroads count    16     64     256
Protocol            Number of transferred messages
SC-SV               5078   5915   6414
SC-LS               750    750    750
C-LS                500    500    500
D-LS                250    250    250
SC-LSB              750    750    750
C-LSB               500    500    500
D-LSB               250    250    250

D. Dependency on the Number of Working Processes

For the dependency on the number of working processes (i.e., the number of sub-networks), the observed parameters were again the speedup and the number of transferred messages. The tests were performed using a road traffic network of constant size (a regular grid of 256 crossroads, see Fig. 14), which was divided into 2, 4, and 8 sub-networks.

Fig. 14. The division of road traffic network used for processes count testing

The tests were performed the same way as the tests for the previous dependency. The dependency of the speedup of the simulation on the number of working processes is depicted in Fig. 15, where the speedup increases with the increasing number of working processes. This is an expected behavior, since an increasing number of nodes of the Hydra cluster was used for the computation. Again, all variants of the LS and LSB protocols give better results than the reference SC-SV protocol. The difference becomes more apparent with the increasing number of working processes. The reason is that the intensity of the inter-process communication increases with the increasing number of road traffic sub-networks (i.e., working processes), as there are more neighboring sub-networks and divided traffic lanes. Thus, the portion of the computation time consumed by the inter-process communication increases and the efficiency of the LS and LSB protocols becomes more apparent.

Fig. 15. Speedup dependent on the processes count

It should be noted that the achieved speedup values are quite high. The best variants of the LS and LSB protocols approach the border of the linear speedup for 2 and 4 simulation processes. The D-LS variant reached the speedup of 3.64 for 4 working processes. The maximal achieved speedup for 8 processes was 5.91. This value is still quite high, although clearly sub-linear. It was reached by the D-LSB variant.

The reason for these results is not solely the efficiency of the communication protocols. There are also other influences that increase the speedup, mostly linked to the utilization of the memory. Clearly, the memory requirements of each sub-network are lower than the requirements of the entire road traffic network. This can speed up the computation of the simulation. For example, it has been determined that during a simulation run, the garbage collection requires on average 505 ms in the simulation of one road traffic sub-network, but 1499 ms in the simulation of the entire road traffic network. This test was performed for the road traffic network divided into 4 sub-networks. Other memory-related influences, such as an increased percentage of cache hits, can also play a role.

The numbers of messages transferred by the particular communication protocols are summarized in Table IV. The number of messages increases with the increasing number of working processes (i.e., road traffic sub-networks). This is caused by the increase of the number of neighboring working processes and of the number of divided traffic lanes connecting them. The increase of the messages count is also caused by the increase of the synchronization messages count. This is most apparent for the reference SC-SV protocol and the SC variants of the LS and LSB protocols. However, for the SC-LS and SC-LSB variants, the increase of the number of synchronization messages is considerably reduced by the utilization of the long step (unlike for the SC-SV protocol).
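How close the reported speedups (3.64 for 4 working processes, 5.91 for 8) come to the linear bound can be expressed as parallel efficiency, i.e., the speedup divided by the number of working processes. The text itself does not use this metric; the sketch below is an added illustration with an assumed function name.

```python
def parallel_efficiency(speedup: float, processes: int) -> float:
    """Fraction of the ideal (linear) speedup that was actually achieved."""
    return speedup / processes


if __name__ == "__main__":
    # Speedup values reported in the text for 4 and 8 working processes.
    for procs, s in [(4, 3.64), (8, 5.91)]:
        print(procs, round(parallel_efficiency(s, procs), 2))
```

An efficiency of about 0.91 for 4 processes and about 0.74 for 8 processes matches the description of near-linear speedup for 4 processes and clearly sub-linear speedup for 8.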

TABLE IV. MESSAGES COUNT DEPENDENT ON THE PROCESSES COUNT

Working processes count    2      4       8
Protocol                   Number of transferred messages
SC-SV                      6429   12928   26686
SC-LS                      750    2000    4500
C-LS                       500    1000    2000
D-LS                       250    1000    2500
SC-LSB                     750    2000    4500
C-LSB                      500    1000    2000
D-LSB                      250    1000    2500

VII. CONCLUSION

In this paper, we described two efficient communication protocols for the distributed road traffic simulation: the Long Step protocol (LS) and the Long Step Binary protocol (LSB), each with three variants. The protocols were thoroughly tested and compared to a reference communication protocol (SC-SV), which represents protocols commonly used in existing distributed road traffic simulators. The results indicate that the developed protocols are able to reduce the number of transferred messages by up to 97 % in comparison to the reference communication protocol. The computation time is reduced by up to 58 %. Using these protocols for simulation of large road traffic networks, it is possible to achieve a noticeable speedup. For four working processes performed on four nodes of the distributed computer, we achieved a speedup of up to 3.64. For eight simulation processes, a speedup of 5.91 was achieved.

In our future work, we will focus on further improvements of the communication protocols. We will also investigate the possibility of utilizing the developed protocols for other (i.e., non-road-traffic) distributed simulations.

REFERENCES

[1] K. Nagel and M. Rickert, “Parallel Implementation of the TRANSIMS Micro-Simulation,” Parallel Computing, vol. 27, No. 12, 2001, pp. 1611–1639.
[2] D. Igbe, N. Kalantery, S. Ijaha, and S. Winter, “An Open Interface for Parallelization of Traffic Simulation,” in Proceedings of the Seventh IEEE International Symposium on Distributed Simulation and Real-Time Applications (DS-RT’03), October 2003, pp. 158–163.
[3] D. Wei, W. Chen, and X. Sun, “An Improved Road Network Partition Algorithm for Parallel Microscopic Traffic Simulation,” in 2010 International Conference on Mechanic Automation and Control Engineering, June 2010, pp. 2777–2782.
[4] Y. Xu and G. Tan, “An Offline Road Network Partitioning Solution in Distributed Transportation Simulation,” in 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications - DS-RT 2012, October 2012, pp. 210–217.
[5] T. Potuzak, “Distributed-Parallel Road Traffic Simulator for Clusters of Multi-core Computers,” in 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications - DS-RT 2012, October 2012, pp. 195–201.
[6] T. Potuzak and P. Herout, “An Efficient Communication Protocol for Distributed Traffic Simulation: Introduction of the Long Step Method,” in Sofsem 2009: Theory and Practice of Computer Science, Proceedings, Volume II, January 2009, pp. 72–83.
[7] T. Potuzak, “Distributed and Centralized Version of an Efficient Communication Protocol for Distributed Traffic Simulation,” in International Conference on Computer Modelling and Simulation (CSSim 2009), September 2009, pp. 259–264.
[8] R. M. Fujimoto, Parallel and Distributed Simulation Systems, John Wiley & Sons, New York, 2000.
[9] Z. Fu, J. Yu, and M. Sarwat, “Demonstrating GeoSparkSim: A Scalable Microscopic Road Network Traffic Simulator Based on Apache Spark,” in SSTD '19: Proceedings of the 16th International Symposium on Spatial and Temporal Databases, August 2019, pp. 186–189.
[10] B. Jiang and H. Zhang, “Realization of Distributed Traffic Simulation System with SCA and SDO,” in 2009 Second International Conference on Future Information Technology and Management Engineering, December 2009, pp. 222–225.
[11] M. J. Lighthill and G. B. Whitham, “On kinematic waves II: A theory of traffic flow on long crowded roads,” in Proceedings of the Royal Society of London, Series A. Mathematical and Physical Sciences, vol. 229, No. 1178, 1955, pp. 317–345.
[12] W. Burghout, Hybrid microscopic-mesoscopic traffic simulation, Doctoral thesis, Royal Institute of Technology, Stockholm, 2004.
[13] K. Nagel and M. Schreckenberg, “A Cellular Automaton Model for Freeway Traffic,” Journal de Physique I, 2, 1992, pp. 2221–2229.
[14] P. G. Gipps, “A behavioural car following model for computer simulation,” Transp. Res. Board, 15-B(2), 1981, pp. 403–414.
[15] R. Klefstad, Y. Zhang, M. Lai, R. Jayakrishnan, and R. Lavanya, “A Scalable, Synchronized, and Distributed Framework for Large-Scale Microscopic Traffic Simulation,” in The 8th International IEEE Conference on Intelligent Transportation Systems, 2005, pp. 813–818.
[16] M. Mastio, M. Zargayouna, G. Scemama, and O. Rana, “Two distribution methods for multiagent traffic simulations,” Simulation Modelling Practice and Theory, vol. 89, 2018, pp. 35–47.
[17] T. Kiesling and J. Lüthi, “Towards Time-Parallel Road Traffic Simulation,” in Proceedings of the Workshop on Principles of Advanced and Distributed Simulation (PADS’05), 2005, pp. 7–15.
[18] N. Cetin, A. Burri, and K. Nagel, “A Large-Scale Agent-Based Traffic Microsimulation Based on Queue Model,” in Proceedings of 3rd Swiss Transport Research Conference, 2003.
[19] T. Potuzak and P. Herout, “Use of Distributed Traffic Simulation in the JUTS Project,” in Proceedings of EUROCON 2007, September 2007, pp. 2250–2255.
[20] Y. Xu, V. Viswanathan, and W. Cai, “Reducing Synchronization Overhead with Computation Replication in Parallel Agent-Based Road Traffic Simulation,” IEEE Transactions on Parallel and Distributed Systems, vol. 28, No. 11, 2017, pp. 3286–3297.
[21] K. Ramamohanarao, H. Xie, L. Kulik, S. Karunasekera, E. Tanin, R. Zhang, and E. B. Khunayn, “SMARTS: Scalable Microscopic Adaptive Road Traffic Simulator,” ACM Transactions on Intelligent Systems and Technology, vol. 8, No. 2, Article 26, 2016, pp. 1–22.
[22] M. S. Ahmed and M. A. Hoque, “Partitioning of Urban Transportation Networks Utilizing Real-World Traffic Parameters for Distributed Simulation in SUMO,” in 2016 IEEE Vehicular Networking Conference (VNC), December 2016.
[23] A. Ventresque, Q. Bragard, E. S. Liu, D. Nowak, L. Murphy, G. Theodoropoulos, and Q. Liu, “SParTSim: A Space Partitioning Guided by Road Network for Distributed Traffic Simulations,” in 2012 IEEE/ACM 16th International Symposium on Distributed Simulation and Real Time Applications - DS-RT 2012, October 2012, pp. 202–209.
[24] Y. Xu, H. Aydt, and M. Lees, “SEMSim: A Distributed Architecture for Multi-scale Traffic Simulation,” in 2012 ACM/IEEE/SCS 26th Workshop on Principles of Advanced and Distributed Simulation, July 2012, pp. 178–180.
[25] D. Hartman, “Leading Head Algorithm for Urban Traffic Model,” in Proceedings of the 16th International European Simulation Symposium ESS, 2004, pp. 297–302.
[26] P. T. R. Wang and W. P. Niedringhaus, “Distributed/Parallel Traffic Simulation for IVHS Application,” in Proceedings of the 25th Winter Conference on Simulation, 1993, pp. 1225–1230.


Simulating Heterogeneous Models on Multi-Core Platforms using Julia’s Computing Language Parallel Potential

Sergey Suslov
ZEA Electronic Systems
Research Center Jülich
Germany
S.Suslov@fz-juelich.de

Michael Schiek
ZEA Electronic Systems
Research Center Jülich
Germany
orcid.org/…X

Markus Robens
ZEA Electronic Systems
Research Center Jülich
Germany
M.Robens@fz-juelich.de

Christian Grewing
ZEA Electronic Systems
Research Center Jülich
Germany
C.Grewing@fz-juelich.de

Stefan van Waasen
ZEA Electronic Systems
Research Center Jülich
Germany
S.van.Waasen@fz-juelich.de

Abstract: This paper addresses the questions of high-level system modelling using heterogeneous multi-tool modelling environment on parallel multi-core processing systems for simulation acceleration. The modelling technique has been applied for high-level validation of a high-precision Indoor Positioning System for Motion Analysis (IPSMA) developed in the Central Institute Electronic Systems (ZEA) of the Research Center Juelich GmbH. The heterogeneous modelling environment has been built using an implementation-level model designed in Matlab/Simulink, a verification model for describing the system environment using Modelica language and Julia language for automatic generation of binding modelling environment and parallelizing the simulation of the overall model. The approach showed a good flexibility in system description and verification in the multi-instrument modelling environment and a good performance gain due to simulation parallelism.

Keywords: distributed simulation, high level system modelling, functional verification, Julia computing language, Modelica, parallelized simulation

I. INTRODUCTION

Modern computing and controlling systems are complex and require comprehensive multi-aspect modelling from the higher abstraction level down to implementation and manufacturing. This top-down approach allows preventing costly design error at the early stages of development.

Functional description of the system under design (SuD) is one of the key high-level aspects in development process. Describing the system functionality is normally carried out by computer-executable modelling using language-based or diagram-based tools, which can offer a high level of abstraction. This modelling mimics interaction of the system with the outer environment via the system’s interaction points, i.e. sensors and actuators, and the reaction processes in the system to the received outer stimuli. The processes at the interaction points are often of a physical nature. This is why they can be best described using the language of mathematics. This high-level mathematical modelling poses special requirements to the tools, demanding the elegance with which the mathematical abstraction can be expressed.

The complexity of the modern systems poses a special demand to the speed of model simulation. The higher speed can be achieved either by increasing the performance of a single processor or by parallelizing the execution of a model. Due to a natural limitation of the modern technology, the latter is often preferable nowadays if a good scalability of a model is achievable. However, due to the complexity in organizing parallel computation, the modelers often neglect this opportunity, thus posing a special demand to the modern modelling means in organizing the parallel computing as seamless as possible. On the other hand, system-level design often requires the cooperation of specialists from different scientific and application fields. Often these team members cannot use a unified toolset due to specific demands on their professional areas. Additionally, the variety in the toolset may be caused by proprietary reasons. These reasons include acquired third-party intellectual property cores, licensing policy or costs for commercial tools, legacy in previously developed reusable code base within the team or a scientific society. Thus, system modelling requires special instruments for creating heterogeneous modelling environment and poses high demand on interoperability between the tools in the toolset.

The modelling procedure presented in this paper addresses these challenges by building a heterogeneous modelling environment that combines two popular modelling instruments, Mathworks Simulink and Modelica, by binding them with Julia computing language that specially addresses the problematics of tool interoperability and intensive exploiting of computing parallelism.

We applied this approach in the development of a high-precision radio-frequency-based Indoor Positioning System for human Movement Analysis (IPSMA) for verifying the specially designed integrated circuit (IPS-RF IC) as the most
critical part of the IPSMA. The IC development concentrated on the enhancement of previously realized IPS concept […] following a new method for position determination […].

In the next sections we briefly give an overview of the application background and the IPS-RF IC design, followed by a short description of the used toolset and subsequent explanation of the heterogeneous modeling environment implementation based on these tools. The results and outlook are given in the last section.

II. APPLICATION BACKGROUND

Fig. 1. Illustration of the Indoor Positioning System for Movement Analysis (IPSMA) comprising the Base Stations (BS) defining the Monitoring Space (MS), the Mobile Devices (MD) determining the TDoA information, and the Processing Unit for reconstructing the positions of the MD based on TDoA data.

The IPSMA (Fig. 1) tracks human movements in the Monitoring Space (MS) of … m² with spatial and temporal resolution of … mm and … ms in real-time. For reconstruction of the limb movements, up to … miniature Mobile Devices (MD) attached to the limbs can be tracked simultaneously. Position estimation is based on the Time Difference of Arrival (TDoA) of virtual events coded in high-frequency radio-wave signals (… GHz) from … to … Base Stations (BS) to the MD. Binary Phase-Shift Keying (BPSK) with … MHz modulation as a bipolar clock signal is used for coding. An MD receives the signals from the BSs through the IPS-RF IC. The IC generates eight data streams associated with the BSs, from which seven TDoA streams are calculated. The IPS-RF IC within the MD acts as a signal pre-processor. The MD transmits the temporarily sorted TDoA packages to the stationary Processing Unit (PU) over Bluetooth (BT). The position reconstruction is performed in the PU.

The central element of the IPS-RF IC is a DSP hardware module for on-the-fly extraction of the TDoA information from the received superposed BS signals. The front-end preprocesses the analog data by amplifying and down-converting it to near-zero frequency, filtering out noise and blocker signals, and ADC sampling to supply the DSP (Fig. 2) with a digitized I/Q data stream. The DSP includes a comb filter for separating the BS channels and two blocks for shifted undersampling of the data and extracting amplitude and phase information resulting in the TDoA information […]. The back-end faces another digital hardware block for further data packaging and data streaming to a BT transmitter. As the most challenging data-processing element of the IPS-RF IC design, the DSP module requires special care in functional verification.

Fig. 2. Scheme of the DSP hardware module within the IPS-RF […]

III. APPLIED TOOLSET

A. Mathworks Simulink

Simulink is a graphical block-based modelling tool built on top of the MATLAB engine. It is broadly used for model-based design of automatic control and digital signal processing systems. The models are primarily built via a graphical interface by creating structural diagrams from a customizable set of functional block libraries. Simulink has a number of add-ons for automated transition from a virtual prototype to an implementation model by generating either a production-level C code or a synthesizable HDL description used in the hardware chip design.

B. Modelica Modelling Language

Modelica is a component-oriented declarative multi-domain free modelling language developed by the Modelica Association […]. The language is designed for describing complex systems’ dynamics in continuous and discrete time domain, having intrinsic notion of modelling time and structure. Systems are represented as a structural hierarchical network of connected functional components describing the element dynamics and interaction with the outer elements. The language and a vast number of domain-specific libraries enable cross-domain modelling in one complex model. It supports both textual and structural diagram entry, combining flexibility and speed of model creation. The language exploits the concepts of object-oriented programming by offering mechanisms of component (class) inheritance, parameterized component instantiation, conditional instantiating or instance replication. The dynamics of a component is described by a system of acausal equations similar to pure functional languages as well as by elements of imperative programming. For inter-component communication via connections, the language relies on the concept of quantity flow and conservation laws, making the bounds between the components bidirectional.

C. Julia Computing Language

Julia is a universal language for scientific computation relying on principle of functional programming […]. Julia adapts on-the-fly computation via JIT mechanism and exploits hardware benefits seamlessly, relying on Low-Level Virtual Machine (LLVM) for hardware abstracting. Its scripting capability enables
us to automatically bind structural blocks of models implemented in different modelling tools. Julia allows boosting simulation of the overall model by supporting a broad range of computation parallelism from SIMD up to cluster systems […]. The mechanisms and instruments for organizing the different types of parallel processing are similar and unified, which allows minimal efforts for migrating from one parallel platform to another. Additionally, the math-oriented high-level syntax of the language and well-developed function libraries allow efficient result analysis and intermediate data handling without lowering the abstraction level. Julia benefits from flexible interoperability with other languages by the internal means of the LLVM (e.g. calling C or Fortran code) or via a set of packages for integration with e.g. Python, nVidia CUDA […] or OpenCL […]. At the same time, Lisp-type expression-based syntax […] and native Unicode support allow Julia to extend its syntax to Domain Specific Languages (DSLs) for better adapting simulation instruments to a target modelling domain.

IV. HETEROGENEOUS MODEL IMPLEMENTATION

Fig. 3. Scheme of the simulation model elements for functional verification of the DSP-Block. The digital signal processing part is performed within the DSP-Block Implementation Model (upper right block), while the superposition of the radio signals emitted by the BSs and received along the trajectory of an MD is modeled within the DSP-Block Verification Model (VM) (upper left block). The positional data of the MD trajectory serve as a ground truth (lower block) for the functional verification.

A. Verification Concept

The DSP-block implemented in Simulink is seen as a black-box model with a front-end and a back-end signal-level interface. Functional verification of the DSP-block (SuD) is performed as shown in Fig. 3:

• the DSP-block is fed with artificially-generated data samples (stimuli) that correspond to the predefined position of an MD in the Monitoring Space (MS);

• the outputs of the DSP are then passed to the reconstruction algorithm to derive the positional information generated by the DSP-block by comparing it to the referenced input positional data.

For the purpose of the spatial and temporal resolution validation, the predefined positions are consequently located on a certain trajectory imitating the movement of an MD with a certain velocity. This approach allows verification of the design which can be later easily reproduced by a robotized testing setup in the testing phase. The structural block-based modelling approach of Simulink lacks flexibility and configurability to satisfy the above-mentioned requirements. The language-based modelling with Modelica has been chosen to build up the movement model. This verification procedure requires a special means for integrating the SuD with the Verification Model (VM) due to inconsistency of modeling instruments. A typical approach to cross-EDA model integration is model exchange using Functional Mock-up Unit (FMU) […], which describes the structure of the model using an XML notation and the model’s functionality using C. Nonetheless, the above-mentioned technique suffers from inability of EDAs to fully discover the parallelization potential of the models automatically. A more flexible solution can be a manageable C-code retrieved either from an FMU or using an automated code generation if this option is available in an EDA.

B. DSP-Block Implementation Model (SIMULINK)

Mathworks Simulink is used for creating an engineering model as a floating-point approximation for DSP-algorithm analysis and data precision assessment that later is refined to an implementation model reflecting particular design solutions with the clock-cycle and data-format accuracy. This model is then parsed by the add-ons for automatic generation of functionally identical code: VHDL-code as input for further synthesis for custom IC implementation and C-source files used as an executable implementation model of the DSP-block for functional verification.

C. DSP-Block Verification Model (Modelica)

The DSP Verification Model consists of two parts. The first represents the physical environment in the MS by generating a data-stream of the amplitude of electromagnetic field at a certain position on an MD movement trajectory. The second acts as the virtual Analog Front-end of an MD to the DSP-block, receiving the data-stream through the interaction point and generating digitized down-mixed I/Q components of the received RF signal for the DSP-block as its output.

The MS model consists of the following major components as instances of the Modelica classes:

• MS configuration, which includes spatial positions of the BSs and locations of obstacle objects in the MS for imitating individual signal shadowing in the scene;

• trajectory of an MD(s) in an analytical form and the movement parameters; the model exploits the object-oriented feature of the language for class inheritance so that in the same MS model trajectories of different geometry (linear, rotational or combined) may be instantiated without model rebuilding;

• BS, representing BPSK signal modulation with individual carrier signal parameters;

• MD, offering the ray-tracing functionality including
XQGHU FULWLFDO FRQGLWLRQV FRUQHU FDVHV  E\ YDU\LQJ WKH 0'¶V GLVWDQFHDQGDQJXODUVL]HVFRPSXWDWLRQRIWKHREMHFWVLQ
YHORFLW\ LWV SUR[LPLW\ WR %6V DQG VLJQDO DWWHQXDWLRQ GXH WR WKH06VFHQH IURP WKH SURVSHFWLYHRI WKH 0' 7KLVLV
SRVVLEOHVKDGRZLQJRIWKHLQGLYLGXDO%6V¶5)VLJQDOV)RUWKH XVHGIRUUHFRQVWUXFWLQJWKHDPSOLWXGHRIHDFKLQGLYLGXDO
HDVHRIWKHYHULILFDWLRQWKHPRGHOHGWUDMHFWRULHVDUHGHVFULEHGLQ VLJQDO DQG DVVXPLQJ WKH LQWHUIHUHQFH RI WKH REVWDFOH
DQDQDO\WLFDODOJHEUDLFIRUP ZLWK PRYHPHQWSDUDPHWHUL]DWLRQ REMHFWV LQ WKH VFHQH WKH 06 PRGHO PD\ FRQWDLQ PRUH

than one MD for the purpose of conjugated movement analysis.

The Analog Frontend model contains a chain of block instances for the BPSK RF signal IQ components separation and down-mixing of the signal to an intermediate frequency with subsequent analog filtering, value sampling and quantization. The OpenModelica EDA is used for the entry of the Verification Model and for resolving the model to the executable C-coded functional model to be integrated in the heterogeneous model environment.

D. Linking and Parallelization of the Simulation (Julia)
The overall model has two types of parallelism. The first, spatial invariance, is related to the RF signal generation: while the MD is moving on a trajectory, the generated superposition of the RF signals at a certain MD position does not depend on the previous location of the MD; the disturbance of the electromagnetic field in any spatial location in the MS can be computed independently. This leads to the idea that the computation of the superposed RF signal along the whole trajectory can be divided infinitesimally for parallel calculation as a function of modelling time (time of the MD's movement). This means that for the signal computation, the trajectory can be divided into sub-trajectories of any length with different start points in modelling time. Exploiting this parallelism potential of the model is critical for the overall simulation due to the GHz RF domain. The second type is the structural parallelism of the DSP-block, which contains […] similar signal-processing channels that correspond to the number of BS emitters and differ only in the characteristic coefficients of each channel. Thus, the signal-processing channels can be simulated in parallel.

For the purpose of parallel simulation of the overall DSP model, a multicore processing system is found to be sufficient. The processing system can be either a distributed cluster or a single multicore processor: the parallel processing code does not need any modification for either system type, except for a preliminary network configuration procedure for the cluster system. Meanwhile, the overhead of a big data transfer over a network connection in case of a cluster can be critical in comparison to organizing data channels in a unified memory access architecture in a single multicore machine. Thus, the latter is found to be sufficient for the current validation procedure.

Creating the specific simulation environment can be seen in two aspects: developing the infrastructure for specific data type handling and analysis, and integrating the two models into the environment using the scripting capability of the Julia language. The scripting capability of the language is used to create reusable code for calling the generated C implementation of the DSP-block signal-processing chain. The script automatically generates a Julia function as a wrapper for a Simulink model and invokes the model behavior. The same is used for invoking the Verification Model functionality.

V. RESULTS AND OUTLOOK
The starting point of the presented work was the verification model completely implemented in MathWorks Simulink that delivered the simulation results of a […] s trajectory in the range of dozens of hours of computation time, thus hindering extensive and efficient functional verification. Additionally, there was a need for more flexible generation of the trajectory. To overcome these limitations, we split the simulation into two model parts. We use the custom DSP block designed in the MathWorks Simulink EDA, which is oriented to analysis and automated implementation in the signal processing engineering domain. For modelling system dynamics, we use a high-level validation model created in the multi-domain mathematical equation-based Modelica language. The merging of the model parts relies on the capabilities of the Julia computing language for interoperating with different tools.

The complexity of a system implies higher requirements on simulation performance. The robust trend for boosting the simulation speed lies in exploiting natural parallelism in systems' models. In our work, two types of parallelism, spatial and structural, have been exploited by using a rich set of Julia language means for organizing parallel computation. In our case, a multicore system has been used for creating a parallel simulation environment. The simulation of the overall Verification Model showed linear scaling of the parallel simulated models, showing a […]x performance increase on an […] processing-core machine, which currently satisfies our needs in modelling the […]-channel DSP-block and short trajectory runs of MDs for RF signal generation.

As a next step, we aim to study performance scaling on a cluster type of parallel processing system for longer trajectory runs. Another interesting way to achieve performance acceleration can be the further exploration of the idea of spatial independence of the electromagnetic field computation, which can be addressed by exploiting Julia's GPGPU capabilities for massive data-parallel computations.

REFERENCES
[1] Y. Yao, S. van Waasen, R. Xiong, M. Schiek, "Method and device for position determination," U.S. Patent Application No. 16/330,768.
[2] R. Xiong, S. van Waasen, C. Rheinländer, N. Wehn, "Development of a novel indoor positioning system with mm-range precision based on RF sensors network," IEEE Sensors Letters, 1([…]), pp. […].
[3] P. Fritzson, V. Engelson, "Modelica - A unified object-oriented language for system modelling and simulation," in European Conference on Object-Oriented Programming, Springer, Berlin, Heidelberg, pp. […].
[4] J. Bezanson, S. Karpinski, V. B. Shah, A. Edelman, "Julia: A fast dynamic language for technical computing," arXiv preprint arXiv:[…].
[5] The Julia Project, "The Julia Language Manual: Multi-processing and Distributed Computing." [Online]. Available: https://docs.julialang.org/en/v[…]/manual/distributed-computing/ [Accessed: Aug. […]].
[6] T. Besard, C. Foket, and B. De Sutter, "Effective extensible programming: unleashing Julia on GPUs," IEEE Transactions on Parallel and Distributed Systems, […].
[7] S. Danisch, "An Introduction to GPU Programming in Julia." [Online]. Available: https://nextjournal.com/sdanisch/julia-gpu-programming [Accessed: Aug. […]].
[8] The Julia Project, "The Julia Language Manual: Metaprogramming." [Online]. Available: https://docs.julialang.org/en/v[…]/manual/metaprogramming/ [Accessed: Aug. […]].
[9] Wikipedia, "Functional Mock-up Interface." [Online]. Available: https://en.wikipedia.org/wiki/Functional_Mock-up_Interface [Accessed: Aug. […]].
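As an illustration of the spatial parallelism described in Section IV-D, the following Python sketch splits a trajectory into sub-trajectories and evaluates each one independently. It is not the authors' Julia implementation: the base-station positions, carrier frequencies, the linear trajectory and all function names are hypothetical, and a thread pool is used only to keep the sketch self-contained (a process pool or a cluster would be needed for an actual speed-up).

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# Hypothetical base stations: (x in m, y in m, carrier frequency in Hz).
BASE_STATIONS: Sequence[Tuple[float, float, float]] = (
    (0.0, 0.0, 1.0e3),
    (10.0, 0.0, 1.2e3),
    (0.0, 10.0, 1.5e3),
)
WAVE_SPEED = 3.0e8  # propagation speed in m/s


def position(t: float) -> Tuple[float, float]:
    """Analytical (linear) trajectory of the mobile device (assumed)."""
    return 1.0 + 0.5 * t, 2.0 + 0.25 * t


def field_amplitude(t: float) -> float:
    """Superposed field at the device position at time t.

    The value depends only on t, not on earlier samples, which is the
    spatial independence the section relies on.
    """
    x, y = position(t)
    total = 0.0
    for bx, by, freq in BASE_STATIONS:
        d = math.hypot(x - bx, y - by)
        total += math.cos(2.0 * math.pi * freq * (t - d / WAVE_SPEED)) / d
    return total


def simulate_chunk(times: Sequence[float]) -> List[float]:
    """Compute one sub-trajectory; chunks share no state."""
    return [field_amplitude(t) for t in times]


def split(times: Sequence[float], n_chunks: int) -> List[Sequence[float]]:
    """Split the time axis into contiguous sub-trajectories."""
    size = math.ceil(len(times) / n_chunks)
    return [times[i:i + size] for i in range(0, len(times), size)]


times = [k * 1.0e-4 for k in range(10_000)]
sequential = simulate_chunk(times)

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves chunk order, so results concatenate back in time order.
    chunks = pool.map(simulate_chunk, split(times, 4))
    parallel = [value for chunk in chunks for value in chunk]
```

Because every sample is a pure function of its own time stamp, the chunked result matches the sequential one sample by sample, whatever the chunk length or start point.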

Pitfalls and Remedies in Modeling and Simulation of Cyber Physical Systems
Alberto Falcone, Alfredo Garro
Department of Informatics, Modeling, Electronics and Systems Engineering (DIMES)
University of Calabria
via P. Bucci 41C, 87036 Rende (CS), Italy
{alberto.falcone, alfredo.garro}@dimes.unical.it

Abstract: The ever-growing advances in science and technology have led to a rapid increase in the complexity of most engineered systems. Cyber-physical Systems (CPSs) are the result of this technology advancement that involves new paradigms, architectures and functionalities derived from different engineering domains. Due to the nature of CPSs, which are composed of many heterogeneous components that constantly interact with one another and with the environment, it is difficult to study them, explain hypotheses and evaluate design alternatives without using Modeling and Simulation (M&S) approaches. M&S is increasingly used in the CPS domain with different objectives; however, its adoption is not easy and straightforward but can lead to pitfalls that need to be recognized and addressed. This paper identifies some important pitfalls deriving from the application of M&S approaches to the CPS study and presents remedies, which are already available in the literature, to prevent and face them.
Index Terms: Modeling and Simulation, Pitfalls, Cyber Physical Systems

I. INTRODUCTION

Over the years, Cyber-physical Systems (CPSs) have increased in complexity and sophistication since, in general, they are composed of many components, which are often designed and developed by organizations belonging to several engineering domains, including mechanical, software, and electronic. CPSs are considered one of the Key Enabling Technologies (KETs) of the fourth industrial revolution, as they can be placed in the foreground for creating value along the three dimensions of the digitalization of industries: smart product, smart manufacturing, and business models. As CPSs get increasingly complex, their study, design, and development become tough without using Modeling and Simulation (M&S) approaches and techniques. M&S represents one of the most important and effective methods for designing, studying, and supporting the acquisition of knowledge of CPSs in a variety of industrial and scientific domains such as automotive, aerospace, and energy. M&S approaches allow design alternatives to be analyzed and evaluated effectively while avoiding the risks, costs and failures associated with extensive field experimentation; this opportunity becomes even more crucial when complete and actual tests are too expensive to be performed in terms of cost, time, and other resources. The adoption of M&S approaches is not easy and straightforward and can lead to pitfalls that need to be recognized and addressed by researchers. Starting from the general M&S lifecycle, the paper identifies some important pitfalls deriving from its application to CPS and presents remedies, which are already available in the literature, to prevent and face them.
The rest of the paper is structured as follows. Section II provides an introduction to the essential concepts of the research domain. Section III presents some important pitfalls deriving from the application of M&S to support the design, study, and development of CPSs. In Section IV, for each identified pitfall, a set of remedies for addressing it, which are already available in the literature, is presented. Finally, conclusions are presented in Section V.

II. MODELING AND SIMULATION OF CYBER PHYSICAL SYSTEMS

Alur in [1] defines a Cyber-Physical System (CPS) as “a collection of computing devices communicating with one another and interacting with the physical world via sensors and actuators in a feedback loop”. Thus, a CPS integrates the Physical components (tangible physical devices) with Cyber subsystems (computational and communicational capabilities) to pursue specific objectives. Examples of CPSs are Air Transportation Systems, Industrial Control Systems, Autonomous Vehicles, and Aerospace Systems [2].
CPSs are hybrid systems whose dynamics are regulated through a mix of continuous and discrete behaviours. Such systems evolve continuously over time and can switch to an operation mode during which state variables are atomically updated. Generally, the continuous behaviour is described by Ordinary Differential Equations (ODE), whereas the discrete one is specified through a control graph. The state of a hybrid system is defined by the values of its continuous variables in a given discrete mode. The state changes either continuously, based on the results produced by the differential equations, or discretely according to the control graph conditions. Continuous flow is allowed as long as invariants hold, whereas discrete transitions occur as soon as switch conditions, also associated with external and/or internal events, are satisfied. A hybrid system can be formally represented as $HS = \langle Q, C_v, D_v, W, \delta, t \rangle$, where $Q$ is a finite non-empty set of states; $C_v$ represents the set of continuous variables; $D_v$ represents the set of discrete variables; $W$ represents the set of initial conditions; $t \in \mathbb{R}^+$ is the time; and $\delta$ is the transition function, given by $\delta : Q \times \Sigma \to Q$ where $\Sigma = \{C_v \cup D_v\}$.
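The hybrid-system tuple can be made concrete with a small example. The following Python sketch is an illustration under stated assumptions, not part of the cited formalism: it simulates a hypothetical thermostat with two discrete modes ($Q$ = {heating, cooling}), one continuous variable (the temperature, so $C_v$ has a single element and $D_v$ is empty), an initial condition $W$ = (heating, 15.0), and a transition function encoded as per-mode guards and targets. The names `Mode` and `simulate`, the numeric constants, and the forward-Euler integration are all choices made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Mode:
    """One discrete state q in Q together with its continuous dynamics."""
    flow: Callable[[float], float]      # dx/dt = flow(x), the ODE of this mode
    invariant: Callable[[float], bool]  # continuous flow is allowed while True
    guard: Callable[[float], bool]      # switch condition for leaving the mode
    target: str                         # mode reached by the discrete transition


def simulate(modes: Dict[str, Mode], q0: str, x0: float,
             t_end: float, dt: float) -> List[Tuple[float, str, float]]:
    """Forward-Euler simulation returning a trace of (t, q, x) samples."""
    q, x = q0, x0
    n_steps = int(round(t_end / dt))
    trace = [(0.0, q, x)]
    for k in range(1, n_steps + 1):
        mode = modes[q]
        # Discrete transition: taken as soon as the switch condition holds
        # (or the invariant of the current mode is violated).
        if mode.guard(x) or not mode.invariant(x):
            q = mode.target
            mode = modes[q]
        # Continuous evolution: one explicit Euler step of the mode's ODE.
        x = x + dt * mode.flow(x)
        trace.append((k * dt, q, x))
    return trace


# Hypothetical thermostat: heat towards 30 degrees, cool towards 10 degrees.
thermostat = {
    "heating": Mode(flow=lambda x: 0.5 * (30.0 - x),
                    invariant=lambda x: x <= 22.0,
                    guard=lambda x: x >= 22.0,
                    target="cooling"),
    "cooling": Mode(flow=lambda x: 0.5 * (10.0 - x),
                    invariant=lambda x: x >= 18.0,
                    guard=lambda x: x <= 18.0,
                    target="heating"),
}

trace = simulate(thermostat, q0="heating", x0=15.0, t_end=20.0, dt=0.01)
visited_modes = {q for _, q, _ in trace}
temperatures = [x for _, _, x in trace]
```

In this sketch the temperature rises from its initial value and then oscillates between roughly 18 and 22 degrees as the system alternates between the two modes. The small overshoot past each threshold comes from the fixed integration step, a simple instance of the numerical approximation issues discussed later in the paper.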
The definition of suitable methodologies and techniques to support the realization of hardware and software systems has been a central theme in the scientific community in the last decades. However, these efforts, which are appropriate for building traditional hardware/software systems, fail in addressing cyber-physical systems due to significant differences in their design characteristics. In order to face these issues, a systematic approach is needed, starting from the identification and clarification of the pitfalls in M&S of CPSs and of the research efforts to overcome them.

Fig. 1. Modeling & Simulation lifecycle (steps shown: Problem Definition, Requirement Elicitation, Conceptual Model, Model Implementation, Verification, Simulation Experiments, Validation).

III. PITFALLS IN MODELING AND SIMULATION OF CPSS

The process of M&S can be described as a seven-step cycle [3], as depicted in Figure 1. The process starts with the “Problem Definition” step, where the bounds of the system and the problems, or a part of them, to analyze are delineated. The qualitative and quantitative criteria to be used to evaluate and classify different system configurations are identified along with the configurations of interest and hypotheses on system performance (e.g. availability, reliability, safety). In the “Requirement Elicitation” step, a formal process is first established by which to identify, analyze and evaluate both the functional and non-functional requirements that the system must meet. The requirements are also collected through interviews involving users, customers, and other stakeholders. At the third step, named “Conceptual Model”, an abstraction of the CPS under investigation is defined. The essential CPS structure and behaviours, which are necessary to answer the research questions identified during the “Problem Definition” step, are identified. The activities carried out during this step make it possible to identify and select only the relevant aspects of the system, leaving out the irrelevant ones. The obtained conceptual model is well-structured and unambiguous, and could consist of SysML models, mathematical equations and logical control functions describing the CPS structure and behaviours [4]. The conceptual model is used by the involved specialists to understand the real system outline and how it is represented in the model [5]. In the “Model Implementation” step, the conceptual model is translated into a computer-acceptable form. This activity involves the identification of the most suitable methodology (e.g., discrete event, continuous, stochastic) and the selection of an appropriate simulation environment (e.g., Simulink, Modelica, and HLA/RTI). The obtained model is verified, in the “Verification” step, to evaluate whether it satisfies the conditions imposed at the “Conceptual Model” step. Verification techniques, which include inspection, traces, design analysis and specification analysis, make it possible to determine whether the virtual model is well-engineered and error-free [6]. In the “Simulation Experiments” step, a set of experiments is performed and evaluated with respect to the research questions. Finally, in the “Validation” step, simulation results are used to check whether the model meets the problem definition constraints and requirements [6].

The application of M&S approaches might encounter many pitfalls, such as inaccurate and/or incomplete requirements, system boundary identification, precision and complexity, implementation, and result interpretation. Each pitfall involves one or more steps of the M&S lifecycle depicted in Figure 1. Table I summarizes the identified pitfalls and the related M&S steps where they may occur.

TABLE I
PITFALLS AND RELATED M&S STEP(S) WHERE THEY MAY OCCUR.

Pitfall                   M&S step(s)
System boundary           Problem Definition
Requirement               Requirement Elicitation
Precision and Accuracy    All
Complexity                All
Implementation            Model Implementation
Data Quality              Simulation Experiments
Result interpretation     Simulation Experiments

a) System boundary: In the Problem Definition step, one of the crucial decisions that researchers must face is to identify the system boundary that captures all the aspects necessary to promote understanding of the system and answer the research questions. The system boundary can be seen as a conceptual line that separates the system to study from everything else [7]. Its identification is not a simple activity: researchers have to find a fair compromise between model fidelity and manageability, since, in general, the conceptual model of the system is a simplified version of the real system [8]. When a wider system boundary is defined, details increase, the conceptual model is more faithful to the real system and experiment results are more accurate; consequently, the complexity of the conceptual model increases, making it error-prone and increasing modeling and debugging times. Vice versa, when a narrow system boundary is specified, the conceptual model is easier to model and simulate, but some CPS parts (e.g. components and behaviours), including environmental effects, are left out, leading to a loss of accuracy in the experiments.
b) Requirement: The definition and management of the system’s requirements are key aspects of a successful M&S project. Unfortunately, it is one of the most difficult things
to achieve, also because it involves many stakeholders from different research domains with multiple perspectives on the system [9]. Throughout the “Requirement Elicitation” step, there are several pitfalls that can be the origin of incorrect requirements: (i) Objectives not well-defined: CPSs are designed to carry out their activities by continuously interacting with the environment in which they operate. When the CPS objectives are not well-defined or stakeholders lose sight of their range of action, the requirements will be too general, leaving out the essential functions in favour of the unnecessary ones; (ii) Inconsistent information: during the entire M&S lifecycle, researchers are often involved in collecting information on the CPSs under study. When the elicitation approaches used are unable to capture all the CPS details, it becomes difficult to classify, determine priorities (i.e., by level of risk, difficulties, costs), and harmonize the often conflicting stakeholders’ needs/objectives; and (iii) Excess information: the elicitation of requirements in long text-based documents leads to confusion about the CPS objectives and makes it difficult for researchers to identify missing components and environmental constraints.
c) Precision and Accuracy: Inaccuracy and imprecision arise due to the hybrid nature of CPSs, which involve both continuous and discrete dynamics [10]. In some cases, the mathematical equations used to describe the continuous behaviors are simple; thus, simulation results can be computed without any numerical approximation. However, in most cases, the continuous behaviors are complex and the mathematical equations also involve Partial Differential Equations (PDE) and/or Integral Equations (IE), which cannot be solved exactly but only through numerical approximations. Other sources of errors are related to the interactions between the continuous and discrete dynamics that may lead to Zeno executions. The Zeno phenomenon occurs when the system undergoes an unbounded number of discrete transitions in a finite and bounded length of time [11]. This phenomenon leads to simulation execution crashes and inaccurate simulation results, and the system behaviors are fundamentally ill-defined beyond the Zeno point.
d) Complexity: In the “Problem Definition” step, complexity pitfalls may arise in the identification of the system boundary and in the definition of the research questions. In the “Requirement Elicitation” step, complexity pitfalls may arise in the capture and management activities, such as ambiguity, multiple requirements and undefined terms [12]. Upon delineating the system boundary, formulating the research questions and capturing the requirements, the conceptual model needs to be formalized. Its formalization, in the “Conceptual Model” step, implies the simplification of the CPS parts and their relationships existing in reality so as to increase the model’s utility [13]. A proper simplification is very important for the success of the simulation study, but at the same time, the conceptual model has to represent reality with sufficient precision for the simulation to produce reliable results. A conceptual model that is too simple cannot capture the fundamental characteristics of the real system, whereas shifting the focus too much to the model precision can lead to the risk of losing essential components of the system and related relationships. This complexity could make the simulation project fail because implementation, verification, and validation activities of the model are compromised.
e) Implementation: Once the conceptual model is finalized, it is implemented through the support of Modeling and Simulation software. Before starting to examine all the available M&S software, it is necessary to decide how to implement the CPS structure and behaviours (e.g., SysML, ODE, Control Graphs) and the kind of simulation to perform (e.g., Continuous Simulation, Discrete Event Simulation, Stochastic Simulation) so as to capture the CPS evolutions and changes over time. Nowadays, there are different M&S software packages, each of which is specialized to address a specific kind of problem (e.g. Modelica, Simulink, and Wolfram SystemModeler). Since the researchers involved in the CPS M&S are different and belong to different research domains, the risk is to choose an unsuitable one that does not offer the functionalities needed to manage the simulation model.
f) Data Quality: In the “Simulation Experiments” step, generally three types of data can be used for performing experiments on the CPS synthetic model: (i) Historical data: past performance data of the overall CPS and its individual components along with environmental conditions; (ii) Real data: data coming from the real CPS in operation, i.e. from sensors and actuators, and outputs of components, including information coming from supplementary business systems; and (iii) Synthetic data: data from engineers, machine learning and artificial intelligence systems. Sometimes, data on how the system operated in the past, how it operates currently, and how the synthetic model relates to the real system is scarce or unusable. This lack of quality in the data makes it challenging to perform valuable simulation scenarios.
g) Result interpretation: After completing simulation experiments, it is necessary to interpret the produced results to provide more readable information and highlight critical aspects that deserve special attention. Each simulation experiment has a model configuration, fixed parameters and initial conditions that make results different, and their correct interpretation is up to researchers. Interpretation pitfalls can arise when researchers interpret the results partially without taking into account aspects related to the CPS structure, behaviors, and environmental conditions, losing the critical distance from their work [14]. Moreover, it is important to favour the reproducibility of results, meaning that a simulation model should not provide different results for each execution with the same initial conditions [15].

IV. REMEDIES IN MODELING AND SIMULATION OF CPSS

This section presents some remedies that are already available in the literature to address the identified pitfalls.
a) System boundary: Without a clear identification of the system boundary, which separates what lies within the CPS to be studied and what is outside (not necessary to
be analyzed), the whole M&S process is likely to fail. Several research efforts have focused on the definition of suitable methods, models and techniques to address this aspect. In [16], a set of practices to correctly identify the system boundary is delineated. According to these practices, the boundary identification is carried out by selecting the environment variables that the CPS monitors and controls. Monitored variables are quantities of the environment whose values impact the CPS behavior (e.g., altitude, inclination, and airspeed of an airplane), whereas controlled ones represent quantities of the environment that the CPS affects with its behaviours (e.g. wing position of the airplane). Wittmann et al. in [17] present a methodology to define the functional system boundaries necessary for evaluating the risk of an automated driving system by observing functional system boundaries and system errors. The proposed methodology makes it possible to model a set of levels of detail that drive the definition of relevant scenarios and system boundaries, supporting the identification of functional system boundaries. In [18], the International Council on Systems Engineering (INCOSE) delineates the Systems Development Life Cycle (SDLC) process that utilizes systems thinking principles to design, integrate, and manage complex systems over their life cycles. It provides guidance and a rationale to establish the external and internal components of a system, and to define its boundaries, including the interfaces that reflect the operational scenarios and expected system behaviours.
b) Requirement: To address the pitfalls related to the definition and management of the system’s requirements, different research efforts propose methodologies and techniques to avoid them. Gillani et al. in [19] present a survey of requirement techniques for managing Safety Critical Systems (SCS). The authors analyzed activities and techniques that should be performed by Requirements Engineering (RE) during safety analysis. Moreover, specific tools to support, in an integrated way, the safety analysis between RE and SCS in Safety Engineering have been explored. In [20], the authors stress the importance of requirement management as the most acute knowledge-intensive activity for managing a complex system. The authors classify, for each requirements elicitation step, the main issues and explore how Artificial Intelligence (AI) techniques can be a viable way to overcome them. The paper also delineates the connection between the identified issues and their potential AI explanations in many requirements elicitation techniques. Milani in [21] claims that in modern organizations the Business Process Model and Notation (BPMN) language is widely used to facilitate communication between engineers, stakeholders, and researchers to understand how a complex system works [22]. The paper presents a BPMN-based method that guides the elicitation of requirements with the domain experts in a collaborative manner.
c) Precision and Accuracy: Numerical computing is a key part of the traditional computer architecture, and almost all traditional computers implement the IEEE 754-1985 standard to represent and work with numbers. However, due to architectural limitations it is impossible to work with infinite and infinitesimal quantities numerically. To increase precision in the simulation calculations and results, it is necessary to adopt a new kind of computer. Falcone et al. in [23] present an innovative solution that allows one to use the Infinity Computer arithmetic within the Simulink environment. The Simulink-based solution allows one to perform numerical computations with finite, infinite, and infinitesimal numbers, increasing the precision of the computations. Regarding the accuracy issues, in [24], the authors present a set of methods to improve the accuracy of simulations involving CPS. Specifically, the authors identified three groups: (i) methods to prevent the occurrence of errors; (ii) methods to reduce current errors; and (iii) techniques for reducing methodical errors.
d) Complexity: Today’s CPSs are hard to design, develop and maintain since they are composed of many interconnected components that make them so large and detailed that no one can understand their behaviours. Keeping complexity under control is fundamental, as overly complex systems lead to an increase in costs and risks. Lindemann et al. in [25] present three main dimensions of complexity that emerge in the context of CPS design and development: (i) Structural Complexity; (ii) Dynamic Complexity; and (iii) Organizational Complexity. For each of them, the main issues are presented along with possible solutions. To support the systematic and holistic analysis of an engineering design process, Kreimeyer et al. in [26] present a measurement system that adopts a set of complexity metrics to integrate the process’ entities (e.g. tasks, documents, and organizational units). Specifically, 52 metrics have been defined for the structural analysis of processes (e.g. timeliness and need for communication). The metrics are supported by a meta-model for process modeling.
e) Implementation: There are different highly specialized M&S software/environments, both commercial and non-commercial, that allow the design and implementation of CPS. However, a single software/environment is not able to manage all the CPS aspects; rather, it is tailored to address a specific type of problem. Thus, a combination of several M&S software/environments is required. Xiao and Fan in [27] present a framework, based on the Model-Driven Architecture (MDA) and the IEEE 1516-2010 (HLA) standard [28], that makes it possible to design and simulate heterogeneous CPSs, also by reusing simulation models already available. In [2], the authors highlight the benefits coming from the joint exploitation of Distributed Simulation (DS) and Co-Simulation approaches to study CPSs. The paper proposes a solution that relies on the integration of the Functional Mock-up Interface (FMI) and the HLA standard for addressing, in an integrated way, the issues of reusability, interoperability and distribution of CPSs. To achieve this integration, the authors defined the Adapter-based Hybrid Federate (A-HF), which makes it possible to reuse a Functional Mock-up Unit (FMU) in co-simulation modality in an HLA simulation in a conservative time-stepped manner.
f) Data Quality: High-quality data is an important aspect to consider in order to successfully conduct simulation experiments involving CPSs. Different methodologies are available in the literature to support data collection and analyze

their quality. Sha and Zeadally in [29] claim that when data, mostly from the physical world, is collected in CPSs, one of the most important challenges is the detection and filtering of faulty data. To improve the quality of the collected data, the authors argue that suitable algorithms must be defined to detect and filter incorrect data efficiently and cost-effectively. In the proposed work, the authors present the challenges and techniques for incorrect data detection and filtering. In [30], the authors present empirical descriptions of simulation data quality problems, data production processes, and relations between these processes and simulation data quality problems by evaluating a multiple-case study within the automotive domain. The obtained results have been used to define guidelines to support manufacturing companies in improving data quality.
g) Result interpretation: The results deriving from CPS simulations are generally only numbers; it is therefore up to researchers to interpret them in order to answer the research questions defined in the “Problem Definition” phase. One of the main threats is their partial interpretation with respect to the hypothesis with which the virtual model was built. In [31], the authors highlight that the extraction of knowledge from simulation results is becoming increasingly important in the design and management of complex systems, since simulation results tend to be dynamic, incomplete, and redundant. To address these issues and extract knowledge from simulation results, the authors present a framework along with data mining algorithms. The framework has been defined by using novel techniques based on Rough Sets Theory (RST) and Principal Component Analysis (PCA) for selecting the main attributes and their implicit relationships to create an object-oriented data model for the simulation results.

V. CONCLUSION

The contribution of the paper is twofold. On the one hand, it identifies some important pitfalls deriving from the adoption of M&S approaches to the study of CPSs, and links them to the corresponding phase(s) of the M&S lifecycle. In this way, researchers have a guide that supports them, according to the M&S phase in which the project is located, in identifying possible pitfalls. On the other hand, it presents, for each identified pitfall, some remedies that are currently available in the literature to overcome it.

REFERENCES
[1] R. Alur, Principles of cyber-physical systems. MIT Press, 2015.
[2] A. Falcone and A. Garro, “Distributed Co-Simulation of Complex Engineered Systems by Combining the High Level Architecture and Functional Mock-up Interface,” Simulation Modelling Practice and Theory, vol. 97, no. August, p. 101967, 2019.
[3] J. S. Carson, “Introduction to modeling and simulation,” in Proceedings of the Winter Simulation Conference, 2005., pp. 8–pp, IEEE, 2005.
[4] P. Bocciarelli, A. D’Ambrogio, A. Falcone, A. Garro, and A. Giglio, “A model-driven approach to enable the simulation of complex systems on distributed architectures,” SIMULATION: Transactions of the Society for Modeling and Simulation International, vol. 95, no. 12, 2019.
[5] M. L. Loper, “The modeling and simulation life cycle process,” in Modeling and Simulation in the Systems Engineering Life Cycle, pp. 17–27, Springer, 2015.
[6] R. G. Sargent, “Verification and validation of simulation models,” in Proceedings of the 2010 winter simulation conference, pp. 166–183, IEEE, 2010.
[7] T. Li, H. Zhang, Z. Liu, Q. Ke, and L. Alting, “A system boundary identification method for life cycle assessment,” The International Journal of Life Cycle Assessment, vol. 19, no. 3, pp. 646–660, 2014.
[8] S. Mittal, U. Durak, and T. Ören, Guide to simulation-based disciplines: Advancing our computational future. Springer, 2017.
[9] C. Wohlin et al., Engineering and managing software requirements. Springer Science & Business Media, 2005.
[10] C. Beisbart and N. J. Saam, Computer simulation validation. Springer, 2019.
[11] J. Zhang, K. H. Johansson, J. Lygeros, and S. Sastry, “Zeno hybrid systems,” International Journal of Robust and Nonlinear Control: IFAC-Affiliated Journal, vol. 11, no. 5, pp. 435–451, 2001.
[12] D. Zowghi and C. Coulin, “Requirements elicitation: A survey of techniques, approaches, and tools,” in Engineering and managing software requirements, pp. 19–46, Springer, 2005.
[13] D. van der Zee, “Approaches for simulation model simplification,” in 2017 Winter Simulation Conference (WSC), pp. 4197–4208, Dec 2017.
[14] R. Barth, M. Meyer, and J. Spitzner, “Typical pitfalls of simulation modeling: lessons learned from armed forces and business,” The journal of artificial societies and social simulation, vol. 15, no. 2, p. 5, 2012.
[15] O. Dalle, “On reproducibility and traceability of simulations,” in Proceedings of the 2012 winter simulation conference (WSC), pp. 1–12, IEEE, 2012.
[16] D. L. Lempia and S. P. Miller, “Requirements engineering management handbook,” National Technical Information Service (NTIS), vol. 1, 2009.
[17] D. Wittmann, C. Wang, and M. Lienkamp, “Definition and identification of system boundaries of highly automated driving,” in 7. Tagung Fahrerassistenz, 2015.
[18] C. Haskins, “Incose systems engineering handbook: A guide for system life cycle processes and activities,” INCOSE, 2007.
[19] M. Gillani, A. Ullah, and H. A. Niaz, “Survey of requirement management techniques for safety critical systems,” in 2018 12th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics (MACS), pp. 1–5, 2018.
[20] S. Sharma and S. Pandey, “Integrating ai techniques in requirements elicitation,” Available at SSRN 3462954, 2019.
[21] F. Milani, “Requirement elicitation using business process models,” in Digital Business Analysis, pp. 311–319, Springer, 2019.
[22] A. Falcone, A. Garro, A. D’Ambrogio, and A. Giglio, “Engineering systems by combining BPMN and HLA-based distributed simulation,” in 2017 IEEE International Conference on Systems Engineering Symposium, ISSE 2017, Vienna, Austria, October 11-13, 2017, pp. 1–6, 2017.
[23] A. Falcone, A. Garro, M. S. Mukhametzhanov, and Y. D. Sergeyev, “Representation of Grossone-based Arithmetic in Simulink for Scientific Computing,” Soft Computing, pp. 1–15, 2020.
[24] Y. Yatsuk and S. Yatsyshyn, “Metrological array of cyber-physical systems. part 5. quality assurance in measuring instrument design,” Sensors & Transducers, vol. 188, no. 5, p. 1, 2015.
[25] U. Lindemann, M. Maurer, and T. Braun, Structural complexity management: an approach for the field of product design. Springer Science & Business Media, 2008.
[26] M. Kreimeyer and U. Lindemann, Complexity metrics in engineering design: managing the structure of design processes. Springer Science & Business Media, 2011.
[27] T. Xiao and W. Fan, “Modeling and simulation framework for cyber physical systems,” in Advanced Methods, Techniques, and Applications in Modeling and Simulation, pp. 105–115, Springer, 2012.
[28] A. Falcone, A. Garro, A. Anagnostou, and S. J. E. Taylor, “An introduction to developing federations with the High Level Architecture (HLA),” in 2017 Winter Simulation Conference, WSC 2017, Las Vegas, NV, USA, December 3-6, 2017, pp. 617–631, 2017.
[29] K. Sha and S. Zeadally, “Data quality challenges in cyber-physical systems,” Journal of Data and Information Quality (JDIQ), vol. 6, no. 2-3, pp. 1–4, 2015.
[30] J. Bokrantz, A. Skoogh, D. Lämkull, A. Hanna, and T. Perera, “Data quality problems in discrete event simulation of manufacturing operations,” Simulation, vol. 94, no. 11, pp. 1009–1025, 2018.
[31] X. Shi, J. Chen, H. Yang, Y. Peng, and X. Ruan, “A novel approach to extract knowledge from simulation results,” The International Journal of Advanced Manufacturing Technology, vol. 20, no. 5, pp. 390–396, 2002.


Laying the path to consumer-level immersive simulation environments
Lorenzo Donatiello Lorenzo Gasparini Gustavo Marfia
Dept. of Computer Science and Engineering Dept. of Computer Science and Engineering Dept. for Life Quality Studies
University of Bologna University of Bologna University of Bologna
Bologna, Italy Bologna, Italy Bologna, Italy
lorenzo.donatiello@unibo.it lorenzo.gasparini3@studio.unibo.it gustavo.marfia@unibo.it

Abstract—Virtual reality is slowly transitioning from a specialized laboratory-only technology to a consumer electronics appliance. In this transition, two interesting research questions amount to how 2D-based content and applications may benefit (or be hurt) by the adoption of 3D-based immersive environments and to how to proficiently support such integration. Acknowledging the relevance of the former, we here consider the latter question, focusing our attention on the diversified family of PC-based simulation tools and platforms. VR-based visualization is, in fact, widely understood and appreciated in the simulation arena, but mainly confined to high performance computing laboratories. Our contribution here aims at characterizing the simulation tools which could benefit from immersive interfaces, along with a general framework and a preliminary implementation which may be put to good use to support their transition from uniquely 2D to blended 2D/3D environments.
Index Terms—Virtual reality, OpenGL intercept, blended 2D/3D interfaces, simulation environments.

This work was supported by the University of Bologna’s AlmaAttrezzature 2017 grant.
978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION
Many works have so far envisioned a future where 2D and 3D interfaces will both be supported by computing systems to provide better performances and experiences [1]–[9]. The price drop of hardware components (e.g., Oculus Quest and Rift with prices below 500€ and Oculus GO below 200€) amounts to one of the factors that may make this happen. The integration of 2D/3D interface paradigms into software platforms proceeds slowly, though, as it is not possible to observe a steep increase in the number of applications adding immersive experiences to their traditional 2D ones. Such resistance may be determined also by the fact that, to the best of our knowledge, no general and simple approach has so far been developed to implement an easy transition from 2D to 2D/3D settings. The possible paradox, which may hence occur in the near future, is that hardware will be ready and cost-effective for mass consumption, while only a scarcity of software solutions will be available. In this paper we focus on this problem, which has been considered to some extent in the literature in past years. Previous works, however, have mainly concentrated on providing immersive interface support either through the provision of software platform specific add-ons or with the exhibition of dedicated APIs inside existing visualization frameworks [10]–[13]. No framework instead exists, to the best of our knowledge, capable of supporting: (a) a wide set of applications, (b) the simple selection of the 2D interface elements which should be exported to a 3D immersive environment, while (c) not requiring the creation of custom software specific to the intended task. We here aim at moving a step forward along the path set by (a), (b) and (c), considering an application domain that has long experimented with and appreciated the opportunities laid by 3D interfaces and Virtual Reality (VR): scientific computation and simulation platforms [14]–[16]. An important body of evidence has demonstrated, in fact, the benefits that can be attained when exploring scientific data using immersive interfaces [17]–[19]. This work contributes to the research path presented so far with an analysis of the graphical libraries utilized by a few of the most widely used desktop computer simulation platforms and a preliminary implementation demonstrating the feasibility of the proposed technical approach. The remainder of this work is organized as follows. In Section II we review the approaches taken, so far, in the development of VR applications. In Section III we delineate the scenario considered in this paper and explain the adopted architectural approach. Section IV describes the results obtained in extending the interface capabilities beyond 2D, for three different simulation platforms. To fully benefit the scientific community, the code related to the work presented in this paper is available at [20].

II. STATE OF THE ART
Scientific visualization, resorting to computer graphics, has been so far used to represent: (a) data sets, which may be the output of numerical simulations, (b) recorded data, or (c) constructed shapes. VR, in particular, has aided in the display of 3D structures providing spatial and depth cues, as it allows a rapid and intuitive exploration of the volume containing the data. The authors of [21], for example, have analyzed the usefulness of VR in specific task performance with volume datasets, finding that such systems improve performance in spatial judgment tasks. More recently VR systems have been assessed in the visualization of complex weather-related information [22]: to this aim, the effectiveness and usability of the Xbox One controller in combination with a VR display proved to be the most effective. Reski and Alissandrakis investigated


the experienced workload and perceived flow of interaction in the exploration of open data [23]: their findings indicated a user preference for virtual representations. Implementation-wise, VR visualizations have, so far, been run according to one of the three following approaches:

• Native VR - Such applications are designed and developed for the VR environment, therefore their user interface is designed in a 3D way in the first place. Native approaches typically provide the most advantages in terms of immersion and efficiency;
• VR Plugins - The application is updated by the producers with the addition of a plugin that supports VR interactions (e.g. SketchUp or AutoCAD) [10], [24]. The disadvantage compared to a native VR approach is that the software is designed for 2D interactions in the first place. However, since the vendors have access to the source code of their application, they can also extend their code-base to increase the immersion of the virtual reality interface (such changes may be expensive, though, if the application has already been released and has reached a considerable size);
• VR Porting - The application solely provides a 2D interface and its producers do not provide any type of plugin or update to support VR interactions. A solution has been found in the past, intercepting the calls made to the rendering library (e.g. DirectX, OpenGL) and redirecting them to an application that handles them, executing them in a VR environment [11]–[13], [25]. This approach may not be as efficient as the previously described ones, but provides the means to flexibly port any type of application, without source code access [11]. In terms of immersion, the main disadvantage is that it may be lower when compared to the previous approaches. It may not be possible to achieve the same level of interaction, as the actions that may be implemented in a general way are limited to those involving the graphics components (i.e., the calls made to the rendering library), while direct interactions with the original application would require custom, per-application solutions, which are ill suited in the context of a general framework. Some companies, such as Techviz or Moreviz, offer porting frameworks based on OpenGL intercept mechanisms: both specialize in the porting of CAD applications [26], [27]. However, both solutions come at a cost, with price expenditures exceeding those affordable to the consumer market.

Compared to the experiences described in this Section, we place our contribution at the intersection of scientific computing, consumer-oriented solutions and virtual reality. With this work we aim at reviving and extending the stream of work started in [11], providing the building blocks necessary to support an immersive view of 3D content produced in different scientific/simulation frameworks. Unlike [26], [27], however, our approach is an open source one. In the following, we describe the path that has been started, according to a vision which aims at providing easy-to-use 3D immersive environments for legacy simulation frameworks.

III. PROPOSED SCENARIO AND SOLUTION
In this Section we sketch the idea behind the use case scenario that we aim to support (Section III-A) and then move on to explain its possible implementation (Section III-B).

A. Scenario
The reference scenario is represented in Figure 1. We envision a situation where a 3D object produced by a simulation software may be explored and manipulated within an immersive environment. In terms of visualization, the aim is to provide the standard 3D manipulation functionalities typically available in immersive environments, such as rotation, scale, translation along the three different axes and zoom [28]. In addition to these, we aim at providing means of toggling on/off or highlighting specific 3D objects or classes of objects. Clearly, with such an approach, the possible interactions are those that may be implemented with the rendered graphics. In a vehicular simulation scenario, for example, it may be possible to highlight (e.g., change the color of) specific groups of automobiles to track their behavior. In the same scenario, it may also be possible to toggle off all the vehicles that are moving, in order to focus on those that are not. As anticipated, we aim at implementing the depicted scenario for as many simulation platforms as possible, while not requiring the users of such platforms to write any lines of code. In the following we explain in detail how this may be done.

Fig. 1. Reference scenario.

B. Practical solution: the OpenGL intercept approach
An approach to library injection into an application was described in the literature for Matlab [11], [12]. Since Matlab employs OpenGL as its graphics library, the authors created an OpenGL library clone as a middle-ware layer between the real OpenGL library and the application. The role of this library was to redirect the calls made by Matlab to another application that handled the rendering in VR. Unfortunately, the developed application no longer works due to a change in how Matlab uses the OpenGL library. Matlab now uses glDrawArrays calls instead of single vertex and color calls, and the original application does not intercept them. We decided to build our proposal upon this approach, rewriting every part of the software architecture to handle the latest Matlab version (v. R2019b). We started by defining an OpenGL library which is injected into the Matlab process. This library intercepts and


redirects some of the calls made by the main Matlab process to our immersive application. Then our application executes these OpenGL calls to render them into a virtual environment. Figure 2 represents the structure of the software architecture. We can identify two key components:
• ML2VR: the application that handles the information received from the injected library, rendering it into a virtual reality environment. For rendering purposes, the application resorts to the OpenVR library, the API and runtime that allows access to VR hardware from multiple vendors without requiring hardware knowledge [29].
• Injected OpenGL: this is the library that is injected into Matlab. This library redirects the relevant OpenGL calls to the ML2VR application. A third application, called Injector, is used to inject the library into the process. This application requires the PID (process ID) of the process and handles the injection of the library in that process.

Fig. 2. System architecture.

IV. PRELIMINARY RESULTS
Figure 3 shows an example of Matlab rendering porting. In the lower picture we show a plot rendered directly from Matlab; in the upper one we have the same plot rendered with the injected library, in a separate 3D environment. As anticipated, by resorting to this approach and handling a large number of OpenGL calls, it may be possible to render in VR any application that uses OpenGL as a graphics library. We hence checked how many simulation software platforms use OpenGL as a graphics library and tried implementing the same approach for them. Table I lists the scientific computing and simulation platforms that have been analyzed for the purpose of this work. We were able to verify their OpenGL support in two different ways:
• Verified from documentation: the programs that, according to the documentation, support OpenGL as a graphics library;
• Tested: the programs for which we have implemented and tested the porting.

Fig. 3. Example of Matlab porting.

TABLE I
SCIENTIFIC COMPUTING AND SIMULATION PLATFORMS USING OPENGL.

Software Verified from documentation Tested
Matlab X
Omnet [30] X
Qualnet [31] X
Arena [32] X
NetLogo3D X
Vissim [33] X
SUMO X

Fig. 4. Example of SUMO porting.

Figure 4 represents the porting of SUMO, a simulation platform used for the study of vehicular flows [34]. The lower part of Figure 4 represents a track and a car simulated in SUMO, whereas the upper one shows the same track and car


rendered in our 3D environment. The SUMO example also provides an instance of a problem that may occur when rendering in a 3D environment. In Figure 4, the car in the 3D environment (the black triangle in the top picture) is not near but far from the track. This happens because, usually, 2D applications rely on the Z axis to sort the elements on the screen using values that, in a 3D context, may create an incorrect display of the objects. The Z value assigned to the car of Figure 4 is too large for a proper 3D render. Finally, we are also working on the NetLogo3D porting; we can nevertheless confirm that NetLogo3D is based on OpenGL calls and is hence extensible according to the proposed approach.

V. CONCLUSION AND FUTURE WORKS
This work aims to reconnect to a stream of works that have been published in the past and which have had the merit of indicating a pathway for the provision of 3D immersive experiences, also for all those applications which are not designed VR-ready. In particular, our contribution aims to respond to the needs of a niche of users that have always demonstrated interest towards immersive technologies: scientific computing and simulation research professionals. The proposed approach may hence, at once, serve an interested group of users, while fostering the development of a set of technologies which may in the near future bloom also in the general consumer market. Future works will require the completion of the implementation and a thorough experimentation, which may also include performance evaluation and human-computer interaction approaches, with the 3D immersive scientific computing and simulation environments supported within the proposed framework.

REFERENCES
[1] K. Risden, M. P. Czerwinski, T. Munzner, and D. B. Cook, “An initial examination of ease of use for 2d and 3d information visualizations of web content,” International Journal of Human-Computer Studies, vol. 53, no. 5, pp. 695–714, 2000.
[2] A. G. Sutcliffe and K. D. Kaur, “Evaluating the usability of virtual reality user interfaces,” Behaviour & Information Technology, vol. 19, no. 6, pp. 415–426, 2000.
[3] J. J. LaViola Jr, “Bringing vr and spatial 3d interaction to the masses through video games,” IEEE Computer Graphics and Applications, vol. 28, no. 5, pp. 10–15, 2008.
[4] W. Cellary and K. Walczak, Interactive 3D multimedia content: models for creation, management, search and presentation. Springer, 2012.
[5] D. A. Bowman, R. P. McMahan, and E. D. Ragan, “Questioning naturalism in 3d user interfaces,” Communications of the ACM, vol. 55, no. 9, pp. 78–88, 2012.
[6] A. Cockburn and B. McKenzie, “3d or not 3d? evaluating the effect of the third dimension in a document management system,” in Proceedings of the SIGCHI conference on Human factors in computing systems, 2001, pp. 434–441.
[7] R. Alkemade, F. J. Verbeek, and S. G. Lukosch, “On the efficiency of a vr hand gesture-based interface for 3d object manipulations in conceptual design,” International Journal of Human–Computer Interaction, vol. 33, no. 11, pp. 882–901, 2017.
[8] L. Donatiello, E. Morotti, G. Marfia, and S. Di Vaio, “Exploiting immersive virtual reality for fashion gamification,” in 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC). IEEE, 2018, pp. 17–21.
[9] E. Morotti, L. Donatiello, and G. Marfia, “Fostering fashion retail experiences through virtual reality and voice assistants,” in 2020 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW). IEEE, 2020, pp. 338–342.
[10] Vr software for virtual reality design — autodesk. [Online]. Available: https://www.autodesk.com/solutions/virtual-reality
[11] D. J. Zielinski, R. Kopper, R. P. McMahan, W. Lu, and S. Ferrari, “Intercept tags: enhancing intercept-based systems,” in Proceedings of the 19th ACM Symposium on Virtual Reality Software and Technology, 2013, pp. 263–266.
[12] D. J. Zielinski, R. P. McMahan, S. Shokur, E. Morya, and R. Kopper, “Enabling closed-source applications for virtual reality via opengl intercept-based techniques,” in 2014 IEEE 7th Workshop on Software Engineering and Architectures for Realtime Interactive Systems (SEARIS). IEEE, 2014, pp. 59–64.
[13] P. O’Leary, S. Jhaveri, A. Chaudhary, W. Sherman, K. Martin, D. Lonie, E. Whiting, J. Money, and S. McKenzie, “Enhancements to vtk enabling scientific visualization in immersive environments,” in 2017 IEEE Virtual Reality (VR), 2017, pp. 186–194.
[14] C. Shaw, M. Green, J. Liang, and Y. Sun, “Decoupled simulation in virtual reality with the mr toolkit,” ACM Transactions on Information Systems (TOIS), vol. 11, no. 3, pp. 287–317, 1993.
[15] C. J. Turner, W. Hutabarat, J. Oyekan, and A. Tiwari, “Discrete event simulation and virtual reality use in industry: new opportunities and future trends,” IEEE Transactions on Human-Machine Systems, vol. 46, no. 6, pp. 882–894, 2016.
[16] I. J. Akpan, M. Shanker, and R. Razavi, “Improving the success of simulation projects using 3d visualization and virtual reality,” Journal of the Operational Research Society, pp. 1–27, 2019.
[17] K. Gruchalla, “Immersive well-path editing: investigating the added value of immersion,” in IEEE Virtual Reality 2004. IEEE, 2004, pp. 157–164.
[18] A. Forsberg, M. Katzourin, K. Wharton, M. Slater et al., “A comparative study of desktop, fishtank, and cave systems for the exploration of volume rendered confocal data sets,” IEEE Transactions on Visualization and Computer Graphics, vol. 14, no. 3, pp. 551–563, 2008.
[19] Y. Peng, Y. Ma, Y. Wang, and J. Shan, “The application of interactive dynamic virtual surgical simulation visualization method,” Multimedia Tools and Applications, vol. 76, no. 23, pp. 25 197–25 214, 2017.
[20] Varlab website. [Online]. Available: https://site.unibo.it/varlab/en/projects/code-and-demos
[21] B. Laha, D. A. Bowman, and J. J. Socha, “Effects of vr system fidelity on analyzing isosurface visualization of volume datasets,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 4, pp. 513–522, 2014.
[22] B. J. Andersen, A. T. Davis, G. Weber, and B. C. Wünsche, “Immersion or diversion: Does virtual reality make data visualisation more effective?” in 2019 International Conference on Electronics, Information, and Communication (ICEIC). IEEE, 2019, pp. 1–7.
[23] N. Reski and A. Alissandrakis, “Open data exploration in virtual reality: a comparative study of input technology,” Virtual Reality, vol. 24, no. 1, pp. 1–22, 2020.
[24] 3d design software — sketchup. [Online]. Available: https://www.sketchup.com
[25] G. Marino, D. Vercelli, F. Tecchia, P. S. Gasparello, and M. Bergamasco, “Description and performance analysis of a distributed rendering architecture for virtual environments,” in 17th International Conference on Artificial Reality and Telexistence (ICAT 2007). IEEE, 2007, pp. 234–241.
[26] Techviz website. [Online]. Available: https://www.techviz.net
[27] Moreviz website. [Online]. Available: http://www.more3d.com/
[28] L. Yu, P. Svetachov, P. Isenberg, M. H. Everts, and T. Isenberg, “Fi3d: Direct-touch interaction for the exploration of 3d scientific visualization spaces,” IEEE transactions on visualization and computer graphics, vol. 16, no. 6, pp. 1613–1622, 2010.
[29] Openvr sdk. [Online]. Available: https://github.com/ValveSoftware/openvr
[30] Omnet++ simulation manual. [Online]. Available: https://doc.omnetpp.org/omnetpp/manual//sec:graphics:overview
[31] Qualnet manual. [Online]. Available: https://www.scalable-networks.com/products/qualnet-network-simulation-software-tool/
[32] Arena installation notes. [Online]. Available: https://www.arenasimulation.com/
[33] Vissim faq. [Online]. Available: https://www.ptvgroup.com/en/solutions/products/ptv-vissim/knowledge-base/faq/visfaq/search/
[34] Sumo - simulation of urban mobility. [Online]. Available: http://sumo.sourceforge.net/
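
Supplementary sketch. Section IV of the paper above notes that 2D applications often use Z values only to order elements on screen, so that an element can end up far from the rest of the scene once the same values are read as 3D depths. The following minimal Python sketch illustrates one possible remedy under stated assumptions: it keeps the layer order but replaces the original Z values with small, evenly spaced depths. It is not the authors' implementation; the function name `compress_layer_depths`, the `spacing` parameter and the sample values are hypothetical and chosen only for illustration.

```python
def compress_layer_depths(z_values, spacing=0.01):
    """Map 2D layer-ordering Z values to small, evenly spaced depths.

    The relative order of the layers is preserved, but consecutive
    distinct layers end up `spacing` apart, so stacked 2D elements
    stay close together when viewed in a 3D scene.
    """
    # Rank the distinct Z values from the lowest layer to the highest one.
    ranks = {z: i for i, z in enumerate(sorted(set(z_values)))}
    # Replace each original Z value by its rank scaled by the spacing.
    return [ranks[z] * spacing for z in z_values]


if __name__ == "__main__":
    # Hypothetical layers: a road at Z=0, lane markings at Z=10, a car at Z=1000.
    original = [0.0, 10.0, 1000.0, 10.0]
    print(compress_layer_depths(original))
```

In this sketch the car remains in front of the road and the lane markings, but its depth is now two spacing units rather than a thousand scene units. Whether such a remapping is appropriate depends on how a given application actually uses the Z axis, which would have to be checked per platform.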


Energy and Distance evaluation for Jamming Attacks in wireless networks
Emilie Bout and Valeria Loscri Antoine Gallais
FUN - Self-organizing Future Ubiquitous Network LAMIH lab, CNRS
Inria Lille - Nord Europe, avenue Halley Polytechnic University Hauts-de-France, Le Mont Houy
Villeneuve d’Ascq, France Valenciennes, France
{emilie.bout,valeria.loscri}@inria.fr antoine.gallais@uphf.fr

Abstract—Wireless networks are prone to jamming-type attacks due to their shared medium. An attacker node can send a radio frequency signal, and if this signal interferes with the "normal" signals of two communicating nodes, the communication can be severely impacted. In this paper, we examine radio interference attacks from the jamming node perspective. In particular, we assume a "greedy" jamming node, whose twofold objective is to interfere with the communication between a transmitter and a receiver node while minimizing its energy consumption and maximizing the detection time. The two communicating nodes are static during the attack window, while the attacker node can adapt its distance from the transmitter in order to select the most suitable range for a successful interference. In order to take into account the distance factor for the effectiveness of the attack, we derive an optimization model representing the attack and we study the key factors that allow an effective and efficient implementation of a jamming attack, namely a) the energy, b) the detection time and c) the impact on the transmission in terms of lowering the PDR. Three different types of attacks are analyzed: 1) Constant Jamming, 2) Random Jamming and 3) Reactive Jamming. Simulation results show that the effectiveness of one jamming attack relative to the others depends not only on the position of the jamming node but also on the distance between the transmitter and receiver nodes.
Index Terms—Placement jammer problem, Jamming attacks, Security, Wireless Networks.

I. INTRODUCTION
The inherent openness of the wireless transmission medium has made wireless communication systems particularly vulnerable to a multitude of attacks. One of the biggest threats to these communication systems is the jamming attack, in part because of its ease of implementation. This kind of attack consists in intentionally interfering with the communication medium to keep it occupied or to corrupt data in transit in order to cause a denial of service (DoS). Most research has focused on creating new detection methods or countermeasures [1]–[4]. Nevertheless, little work has been oriented towards optimizing the impact of these attacks.
The effectiveness of a jamming attack is based on many parameters such as the transmission properties (e.g., modulation, power), the characteristics of the network (e.g., routing), and the strategy of the jammer along with its position. The last point has been the subject of a few studies in recent years under the name of the jammer placement problem. The goal of this problem is to find the optimal position of the jammer to minimize the throughput of the network. Studying this problem would make it possible to improve detection methods, such as the localization of jamming nodes [5], [6].
In [7], the authors study the impact of several types of jammers as a function of their distance from the victim nodes and the size of packets. They deduce that the closer the attacker is to its victim, the more effective it is. However, this also leads to a high probability of detection. Panyim et al. investigated whether the random positioning of a jammer can be more effective than a strategic choice of the attacker's position [8]. They conclude that the aggressor has more impact on the network when the jammer is situated next to a node through which a lot of data transits. The number of jamming devices (and their locations) required to suppress a given network was also investigated [9]. The authors compare the impact of the jammer when it is placed at random and when it is placed on a uniform grid. This placement problem can be formulated as an optimization problem where the goal is to corrupt a maximum number of packets from the target network, while keeping a low detection probability [10].
This study is inspired by those previous works but takes into account the fact that the attacker is also a constrained node (e.g., energy, computation). By considering the attacker perspective, we show here that there exists a trade-off between the efficiency of a jammer, its distance from the communication and its energy consumption. We assume an attacking node which aims to interfere with the communication as much as possible, while maximizing its impact on the network and minimizing its energy consumption and its probability of being detected.
We use the simulator NS-3 [11] to compare the energy consumed by three distinct jamming strategies, as a function of the jammer's distance from the victim node and the distance between the transmitter and the receiver. Our analysis shows that, for each strategy, the distance between the two communicating nodes influences the jamming efficiency and the probability of being detected. We also show that for each scenario, there is a position of the attacker which makes it possible to reduce its energy consumption and its probability of being detected while


having a reasonable impact on the networks.
The main objective of this study is to prove that the choice of the optimal interference strategy does not only depend on the jammer's position in the network but also on its energy consumption and its probability of being detected.
This article is organized as follows. In section II, the network model, the jamming attack strategies, and the detection issue are described. We introduce the problem formulation in section III and we provide details of simulations and results in section IV. We conclude the paper in section V.

II. SYSTEM MODEL
A. Network Model
We consider a wireless communication scenario with one transmitter, one receiver and one jammer. We assume that radios have equal transmit power and equal noise power. We assume that nodes are limited in energy. We define an amount of energy in the initial state $E_0$. At the end of each transmission or each change of state of a device, the consumed energy of a node is calculated as follows:

$$E_{i+1} = E_i + V \cdot (t_{i+1} - t_i) \cdot I_i, \qquad (1)$$

where $E_i$ is the energy consumption at time $t_i$, $V$ is the supply voltage and $I_i$ is the total current drawn by the node over the interval $[t_i, t_{i+1}]$.

B. Attacker Model
The attacker has the same configuration as the legitimate nodes in order to reduce the probability of being detected. To best correspond to reality, the attacking device is also an energy-constrained node. The energy consumption is calculated in the same way as for the other nodes of the network, following Eq. (1).
A jamming attack has the purpose of causing a denial of service by preventing the exchange of packets between the legitimate nodes of the network. The jammer has the option of voluntarily occupying the channel or causing collisions in order to corrupt the packet and force the node to retransmit. Several jamming strategies [2], [3], listed below, have been developed over the years to make the jammer more efficient and less detectable.
Constant Jammer: The basic strategy is to continuously send random bits on the channel to occupy the transmission channel for a certain time. However, from an attacker’s point of view, this strategy consumes a lot of energy and is easily identifiable.
Deceptive Jammer: Instead of sending random bits, the jammer injects packets continuously on the channel. The goal is to deceive the receiver so that it remains in reception status. Just like the first strategy, this one consumes a lot of battery and resources for an attacker.
Random Jammer: This method allows the attacker to save energy by going from an active state to a sleeping state at random time intervals. During the active state, the jammer can choose between the two approaches seen above.
Reactive Jammer: This tactic aims to minimize the risk of being detected. Therefore, the attacker jams the channel only upon packet transmission. This strategy reduces attack time and increases its effectiveness because the attacker no longer blindly jams the network.
We have chosen to implement three jamming approaches inspired by those mentioned above. Our first strategy, the Constant Interval Jammer, consists in injecting packets on the channel for a certain period at regular time intervals. We chose a very short interval between two jamming bursts in order to corrupt as many packets as possible.
The second is an implementation of a Random Jammer, which randomly draws, within a given interval, the duration during which it remains in an idle state after each sending of packets. The aggressor therefore alternates between the two states randomly. The last implementation corresponds to a Reactive Jammer.
Table I shows the send interval for each type of jammer during the simulation.

Parameters           Constant Interval Jammer   Random Jammer         Reactive Jammer
Send interval (ms)   1                          Between 1 and 100     Send interval of the legitimate node
Energy (J)           55                         55                    55
Supply voltage (V)   3                          3                     3

TABLE I: Jamming node Settings.

C. Attack Detection Model
One of the most used metrics to detect a jamming attack is the Packet Delivery Ratio (PDR) [7], [12], defined below:

$$\text{Packet Delivery Ratio} = \frac{\sum \text{Number of } P_{SD}}{\sum \text{Number of } P_{T}}, \qquad (2)$$

where $P_{SD}$ is the number of packets successfully received at the destination and $P_T$ represents the actual number of packets transmitted at the source. In our case, the global PDR of the network is updated after each packet sent by the transmitter. Detection is performed at regular time intervals and is based on a detection threshold predefined when the network is set up. When the PDR falls below the detection threshold, an attack is identified. We assume that we are in an optimal situation where the PDR with no attack is 100%. Therefore, during the simulations, we set the detection threshold to 99%.

III. PROBLEM FORMULATION
In this section we propose a formulation of the problem from an attacker's point of view in the discrete-time domain.
For each time slot $t$, we define a variable $x_t(i) \in [0, 1]$ for all the positions/distances of the jamming node. We assume that the achievable rate between the transmitter and receiver can be approximated with the link capacity $c$ defined as:

$$c = W \cdot \log_2\left(1 + \frac{S}{N}\right), \qquad (3)$$


where $W$ is the system bandwidth and $\frac{S}{N}$ is the signal-to-noise ratio between the transmitter and receiver. We assume that in the absence of any external interference (i.e., jamming attacker), the achieved capacity only depends on the reciprocal distance between the two communicating nodes.
Since a "greedy" jamming node is considered, its main objective is to decrease the effective Packet Delivery Ratio (PDR), while minimizing its energy expenditure (which depends on its distance from the transmitter) and increasing its detection time. Intuitively, if the attacker is close to the transmission, it will be more effective while spending less energy (which is adjusted with the distance), yet its attack can fail since detection can be very fast. Since we consider three different aspects that can conflict with each other, we formulate three different functions $F_1$, $F_2$ and $F_3$. $F_1$ characterises the goal of impacting the PDR of the communication. In particular, in time slot $t$, the achieved rate with respect to the distance $i$ is:

$$R_t(i) = x_t(i) \cdot c_t(i), \qquad (4)$$

and the function $F_1$ can be defined as:

$$F_1 = \sum_{t=1}^{T} \sum_{i=1}^{D} E[R_t(i)] = \sum_{t=1}^{T} \sum_{i=1}^{D} E[x_t(i) \cdot c_t(i)], \qquad (5)$$

where $T$ is the total number of time slots, $D$ is the distance, and $E$ is the expectation taken with respect to the randomness of $c_t(i)$, computed as in (3). Hereafter, $E[\cdot]$ will indicate the average. $F_1$ accounts for the fact that if the transmissions of both the emitting and the jamming nodes happen in the same time slot, they will collide with high probability. This means that if the packet reaches the receiver, it will fail the CRC check and be discarded, with a negative effect on the PDR. The function $F_2$ accounts for the energy expenditure of the jamming node, depending on its distance to the transmission, and can be expressed as:

$$F_2 = \sum_{i=1}^{D} E[i^2] \qquad (6)$$

The function $F_3$ accounts for the detection time, which is proportional to the distance of the jamming node. The greater the distance of the attacker, the longer it takes to detect the attack. However, if the attacker is too far, an effective attack would have a smaller impact while requiring more energy from the attacker node. We thus compute $F_3$ as follows:

$$F_3 = \sum_{i=1}^{D} E[E_n(i)], \qquad (7)$$

where $E_n(i)$ is a function proportional to the distance.
We then compute:

$$\min(F_1 + \lambda_e \cdot F_2 - \lambda_d \cdot F_3) \qquad (8)$$

subject to

$$\sum_{i=1}^{D} \mathbb{1}_{x_t(i)} < \delta, \qquad (9)$$

$$\sum_{i=1}^{D} \Pi_i \, x_t(i) = 0, \qquad (10)$$

where $\delta$ is a threshold distance (beyond this distance the attack has no effect on the transmission), $\lambda_e$ is a variable weighting the importance of the energy consumption, and $\lambda_d$ is a variable weighting the detection factor. Equation (10) means that for each distance there is at least one slot where the transmitter and the attacker send data in the same slot. This optimization problem is non-linear and the different types of attacks considered will not be optimal. For example, the reactive jamming tries to "intercept" the transmission, but in order to do that its energy consumption will be larger.
Hereafter, we evaluate the different types of attacks with respect to the impact on the PDR, the energy consumption of the attacker node and the detection time. In particular, we implement the functions $F_1$, $F_2$ and $F_3$ and we evaluate them for the different types of attacks.

IV. PERFORMANCE EVALUATION
A. Simulation Details
The jamming attacks were simulated using the discrete event simulator NS-3 (Network Simulator-3). The parameters set during simulations are shown in Table II. The transmitter transmits a packet every 0.1 seconds and begins its transmission at the start of the simulation (t = 0). The jammer aims to jam the transmitter node.

Parameter Name                        Setting Used
Radio Propagation Model               Friis Propagation Loss Model
Routing protocol                      Ad-hoc routing
Energy Model                          EnergyBasicModel
Size of Legitimate Packet (octets)    1000
Send interval legitimate nodes (s)    0.1

TABLE II: Simulation and Node Parameters.

B. Results and Analysis
Our objective is to evaluate the impact of the different kinds of jamming attacks on the network as a function of the placement of the malicious node. A study of the energy consumption as a function of the placement of the attacker is also carried out. In particular, we evaluate the three different types of jamming attacks (the constant interval jammer, the random jammer and the reactive jammer) by jointly considering three factors: a) Detection Time; b) Energy Spent; c) Packet Delivery Ratio (PDR). Indeed, in order to be effective, an attack has to be detected as late as possible (high detection time), the attacker has to minimize its energy consumption and the PDR between transmitter and receiver has to be impacted as much as possible.
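The three evaluation factors, together with the weighted cost of Eq. (8), can be computed from a simulation trace. The following is a minimal Python sketch under stated assumptions, not the authors' NS-3 code: the trace formats, the function names and the detection check period are hypothetical, while the 3 V supply voltage, the 99% threshold and Eqs. (1), (2) and (8) come from the text.

```python
from typing import List, Optional, Tuple

SUPPLY_VOLTAGE_V = 3.0      # supply voltage from Table I
DETECTION_THRESHOLD = 0.99  # detection threshold from Section II-C


def consumed_energy(states: List[Tuple[float, float]], t_end: float,
                    voltage: float = SUPPLY_VOLTAGE_V) -> float:
    """Accumulate Eq. (1): E_{i+1} = E_i + V * (t_{i+1} - t_i) * I_i.

    `states` is a time-ordered list of (start_time_s, current_A) pairs
    (assumed format); each current holds until the next state starts.
    """
    energy = 0.0
    for idx, (t_i, current_i) in enumerate(states):
        t_next = states[idx + 1][0] if idx + 1 < len(states) else t_end
        energy += voltage * (t_next - t_i) * current_i
    return energy


def packet_delivery_ratio(received: int, transmitted: int) -> float:
    """Eq. (2). Before any packet is sent, the PDR is taken as 1.0 (assumption)."""
    return 1.0 if transmitted == 0 else received / transmitted


def detection_time(deliveries: List[Tuple[float, bool]], check_period: float,
                   threshold: float = DETECTION_THRESHOLD) -> Optional[float]:
    """Return the first periodic check time at which the global PDR is below
    the threshold, or None if the attack is never detected.

    `deliveries` is a time-ordered list of (send_time_s, delivered) pairs
    (assumed format), one entry per packet sent by the transmitter.
    """
    sent = received = 0
    idx = 0
    k = 1
    while idx < len(deliveries):
        check_time = k * check_period
        # Update the global PDR with every packet sent up to this check.
        while idx < len(deliveries) and deliveries[idx][0] <= check_time:
            sent += 1
            received += int(deliveries[idx][1])
            idx += 1
        if packet_delivery_ratio(received, sent) < threshold:
            return check_time
        k += 1
    return None


def greedy_cost(f1: float, f2: float, f3: float,
                lambda_e: float, lambda_d: float) -> float:
    """Scalar cost of Eq. (8): F1 + lambda_e * F2 - lambda_d * F3."""
    return f1 + lambda_e * f2 - lambda_d * f3
```

With such helpers, each candidate jammer distance could be scored by running one simulation, extracting the three quantities, and comparing the resulting costs; the weights `lambda_e` and `lambda_d` would have to be chosen by the analyst, since the paper does not report the values used.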


The first set of simulations is based on a distance between the transmitter and the receiver of 20 meters. In Figure 1, we report a) the Detection Time as a function of distance, b) the Total Energy Spent for an attack by the jamming node and c) the Packet Delivery Ratio of the communication between the transmitter and the receiver.

Fig. 1: Distance between Transmitter and Receiver equal to 20 meters. (a) Detection Time; (b) Total Energy spent by the jamming node; (c) Packet Delivery Ratio.

Among the three types of attacks, the reactive jamming is less detectable than the constant and random ones. On the other hand, the energy depleted by the reactive jamming node is much higher than for the other two types of attacks. Moreover, the detection time increases for constant and random attacks when the attacker is positioned around 25 to 35 meters.
In particular, the constant jamming is more effective in this distance interval, since the energy spent on the attack is less than 2 Joules and the detection time increases and reaches 3 seconds around 35 meters, while the PDR is noticeably impacted: up to an attacker distance of 30 meters, the PDR is smaller than 80%. It is worth recalling that we are considering an ideal scenario, where no other communication interferes with the channel of the transmitter and receiver, so we expect a PDR of 100%. The constant jamming thus reduces the PDR by about 20%.
In order to evaluate how the distance between the transmitter and receiver affects the effectiveness of a jamming attack when the same transmitter power level is considered, we increase the distance between the sending node and the receiver to 60 meters. This scenario confirms that the most effective attack is the reactive jamming.
Indeed, when the jamming node is positioned at around 50 meters from the transmitter, detection time is around 2.4 seconds and reaches 3 seconds at 65 meters. The PDR is highly impacted since it reaches 70% at 50 meters and 90% at 65 meters. In practice, the optimal position of the reactive jammer in this scenario is around 50 meters, with an energy consumption around 2 Joules. The other two attacks have a low energy consumption, but they are not effective since the detection time is almost constant and equal to 1 second (i.e., the attack is soon detected) and only increases slightly around 80 meters.

C. Discussion
The analysis of the different scenarios raises some interesting observations. First of all, as already assessed in previous works, there is a strong relation between the position of an attacker and its effectiveness in a wireless context. As the attacker considered in this work is a greedy node, aiming at being effective in terms of impact (i.e., by lowering the PDR) but with the minimum energy consumption, our evaluation shows that different types of attacks can be more effective depending on the distance between the two communicating nodes. In the specific scenarios considered, the constant attack has more impact than the random and the reactive ones when the distance between the two communicating nodes is small (e.g., 20 meters). On the other hand, the reactive jamming is more effective when the distance between transmitter and receiver increases. A jamming node can easily implement the three different types of attacks by switching from one to the other, based on the specific situation of the two nodes that are communicating. It is sufficient for the attacker node to listen for long enough to acquire the needed data and infer information such as the distance between the two nodes.
In this work we have considered an "ideal" scenario where only two nodes are exchanging data, so no external interference is considered; the detection is also ideal, in the sense that it uses a fixed threshold and we assume it is able to perfectly detect the jamming attack with no false alarm. This is not true in a realistic scenario, where a lower PDR can have different causes and the detection scheme needs to account for all these situations. The main objective of this analysis was to highlight not only the dependence of the attacker position


with the transmitter node, but also the fact that an attack can be more effective than others depending on the specific scenario when multiple factors are evaluated together, such as detection time, energy consumption and effectiveness in impacting the communication.

Fig. 2: Distance between Transmitter and Receiver equal to 60 meters. (a) Detection Time; (b) Total Energy spent by the jamming node; (c) Packet Delivery Ratio.

V. CONCLUSION
In this paper, we have studied the impact of the position of three kinds of jammer as a function of the distance between the legitimate nodes of the network. In order to evaluate the performance of the different jamming nodes, we have considered not only the impact on the PDR of the "legitimate" nodes, but also the detection time and the energy spent by the attacker node. The key factors, namely the energy consumption, the impact of the attack in terms of PDR and the detection time, have to be considered in a single framework in order to evaluate the best positioning and also the best type of jamming. Indeed, results have shown that one type of jamming attack can be more effective than others, depending on the relative distance of the transmitter and receiver nodes. Based on these results, it would be interesting to consider an attacker which selects the most appropriate jamming strategy and its position in the network according to the studied parameters. Our future work will be based on a network composed of numerous nodes or mobile nodes or both. It would also be interesting to study this problem on multi-channel networks. Indeed, all these parameters can change the impact of each attack strategy.

ACKNOWLEDGMENT
This work was partially supported by the General Armament Direction, France and the Defense Innovation Agency, France.

REFERENCES
[1] S. R. Ratna and R. Ravi, “Survey on jamming wireless networks: Attacks and prevention strategies,” International Journal of Computer and Information Engineering, vol. 9, no. 2, pp. 642–648, 2015. [Online]. Available: https://publications.waset.org/vol/98
[2] T. Hamza, G. Kaddoum, A. Meddeb, and G. Matar, “A survey on intelligent mac layer jamming attacks and countermeasures in wsns,” in 2016 IEEE 84th Vehicular Technology Conference (VTC-Fall), 2016, pp. 1–5.
[3] S. Jaitly, H. Malhotra, and B. Bhushan, “Security vulnerabilities and countermeasures against jamming attacks in wireless sensor networks: A survey,” in 2017 International Conference on Computer, Communications and Electronics (Comptelix), 2017, pp. 559–564.
[4] K. Grover, A. Lim, and Q. Yang, “Jamming and anti-jamming techniques in wireless networks: A survey,” Int. J. Ad Hoc Ubiquitous Comput., vol. 17, no. 4, pp. 197–215, Dec. 2014. [Online]. Available: https://doi.org/10.1504/IJAHUC.2014.066419
[5] Q. M. Ashraf, M. H. Habaebi, and M. R. Islam, “Jammer localization using wireless devices with mitigation by self-configuration,” PLOS ONE, vol. 11, no. 9, pp. 1–21, Sep. 2016. [Online]. Available: https://doi.org/10.1371/journal.pone.0160311
[6] T. Wang, T. Liang, X. Wei, and J. Fan, “Localization of directional jammer in wireless sensor networks,” in 2018 International Conference on Robots Intelligent System (ICRIS), 2018, pp. 198–202.
[7] W. Xu, W. Trappe, Y. Zhang, and T. Wood, “The feasibility of launching and detecting jamming attacks in wireless networks,” in Proceedings of the 6th ACM International Symposium on Mobile Ad Hoc Networking and Computing, ser. MobiHoc ’05. New York, NY, USA: Association for Computing Machinery, 2005, pp. 46–57. [Online]. Available: https://doi.org/10.1145/1062689.1062697
[8] K. Panyim, T. Hayajneh, P. Krishnamurthy, and D. Tipper, “On limited-range strategic/random jamming attacks in wireless ad hoc networks,” in 2009 IEEE 34th Conference on Local Computer Networks, 2009, pp. 922–929.
[9] C. Commander, P. Pardalos, V. Ryabchenko, O. Shylo, S. Uryasev, and G. Zrazhevsky, “Jamming communication networks under complete uncertainty,” Optimization Letters, vol. 2, pp. 53–70, Jan. 2008.
[10] M. Li, I. Koutsopoulos, and R. Poovendran, “Optimal jamming attacks and network defense policies in wireless sensor networks,” in IEEE INFOCOM 2007 - 26th IEEE International Conference on Computer Communications, 2007, pp. 1307–1315.
[11] “Ns-3 website,” https://www.nsnam.org/, last accessed 14 June 2020.
[12] O. Osanaiye, A. Alfa, and G. Hancke, “A statistical approach to detect jamming attacks in wireless sensor networks,” Sensors, vol. 18, May 2018.


A Real-time Simulation Framework for Complex and Large-scale Optical Transport Networks based on the SDN Paradigm

A.A. Shah (1,2), M. Mussini (4), F. Nicassio (3), G. Parladori (4), F. Triggiani (3), G. Grieco (1,2), G. Iaffaldano (1), and G. Piro (1,2)
(1) DEI, Politecnico di Bari, v. Orabona 4, 70125, Bari, Italy
(2) CNIT, Consorzio Nazionale Interuniversitario per le Telecomunicazioni, Italy
Email: {awais.shah, giovanni.grieco, giuseppe.iaffaldano, giuseppe.piro}@poliba.it
(3) Experis S.r.l., v. Kennedy 34, 20871, Vimercate, Italy
Email: {francesco.nicassio,francesco.triggiani}@it.experis.com
(4) SM Optics S.r.l., v. P. Castaldi 8, 20124, Milano, Italy
Email: {marco.mussini,giorgio.parladori}@sm-optics.com

Abstract—Thanks to the recent advancements in the Software-Defined Networking (SDN) and Network Function Virtualization research domains, telecom operators are encouraged to upgrade their optical transport networks towards programmable, energy-efficient, service-oriented, and interoperable architectures. The availability of a large set of open-source building blocks, supported by different standardization bodies, makes the selection and the integration of such technologies a very complex task. In this context, the INTENTO project has the objective to create an innovative simulation framework by selecting the best technologies and to use it to test applications, services, and advanced optimization algorithms in a real environment. In the initial phase, the project designed a large-scale, distributed, and hierarchical Transport SDN architecture, where optical switches and networking functionalities are monitored and dynamically configured through a two-level structure of SDN controllers. On top of that, Virtual Network Functions are optimally deployed and managed by a centralized orchestrator, based on network conditions, user requests, and application requirements. Based on this architecture, the project team started to develop a complex simulation environment that harmoniously integrates within the OpenStack cloud: optical node simulators composed of a simulation agent and a suitable hardware emulation layer; a proprietary SDN network controller designed to enable the innovative characteristics of the optical nodes; and the Open Network Operating System as the second-level controller, enabling the integration of third-party or standardized models (multivendor environment), based on standardized interfaces and communication protocols. After having described the main components and functionalities already implemented in the simulation framework, the paper concludes by highlighting future research and development activities.
Index Terms—Optical Transport Networks; Software-Defined Networking; Virtual Network Functions; Simulation Framework

I. INTRODUCTION
Software-Defined Networking (SDN) is a cutting-edge technology for the deployment of programmable and virtualized service infrastructures. On the other hand, Network Function Virtualization (NFV) emerged as a new network architecture approach to virtually deploy network functions on generic hardware [1]. The state of the art demonstrates that the combination of SDN and NFV enables unprecedented levels of network control, dynamicity, and flexibility [2]–[4]. Telco operators are encouraged to take this opportunity by integrating SDN and NFV into their large-scale geographical networks, an approach known as Transport-SDN (T-SDN). However, this integration is not straightforward and poses numerous challenges due to the large-scale complexity of telecommunication networks and the selection of suitable technology from the available open-source projects backed by different standardization bodies.
The INTENTO (INTElligent NeTwork Orchestration Framework) project [5], recently funded by the Apulia Region (Italy), addresses the aforementioned issues by developing an innovative simulation framework, selecting the appropriate state-of-the-art technologies and integrating them to test applications, services, and advanced optimization algorithms in a real-time and complex T-SDN environment. In the initial stage of the project, a T-SDN architecture has been designed that incorporates distributed and hierarchical monitoring and deployment of large-scale optical switches and network functionalities (i.e., VNFs) by means of a two-level structure of SDN controllers. The level-1 SDN controller manages the optical nodes, whereas the role of the level-2 controller is to allow the integration with third-party and multi-vendor environments. The Virtual Network Functions (VNFs) are optimally deployed via a central orchestrator based on the network and user requirements.
Based on the proposed architecture, the project team has built a real-time and complex simulation environment within the OpenStack cloud, consisting of the following functionalities: (1) optical node simulators consisting of simulation agents with the emulated hardware layer, (2) the level-1 proprietary SDN controller developed as part of this project to manage the advanced features of the optical nodes, and (3) Open Network


Operating System (ONOS) has been adopted as the level-2 The extensive use of a virtualized environment, allow the
SDN controller in order to facilitate integration for third party mix of real nodes and simulators and can be used to reach a
and multi-vendor environments on standardized communica- very high node count, to test very complex networks.
tion protocols like Transport API (T-API), NETCONF, and The Project objectives are supported by an IT infrastruc-
RESTCONF, for both southbound and northbound interfaces. ture based on standard servers. More precisely, we aim at
It is worth noting that in this architecture real equipment can be demonstrating the overall framework capability of simulating
integrated into the simulation environment. Automated deploy- a complex infrastructure management, including optical layer
ment procedure has been developed to effectively deploy the design and planning, multilayer (DWDM / OTN / Packet)
complete simulation environment. At the time of this writing, management. The simulation framework can be used to carry
to certify the effectiveness of the overall simulation framework out complex simulation scenarios, very useful to select the
the project identifies innovative applications, which will be best solutions among alternative option and assess the overall
addressed in the final part of the INTENTO Project. performance. In addition, thanks to the open framework archi-
The rest of the paper is organized as follows: Section 2 tecture, advanced applications will be selected and tested, i.e.,
presents the overview, goals, high-level architecture of the IN- the effectiveness of a set of VNFs, which may be hosted in
TENTO project. Section 3 describes the introduced simulation the optical nodes.
framework, implemented technologies, and the future goals
of the project. Section 4 draws the conclusions of this work. A. Targeted use cases
Finally, the acknowledgments are given in Section 5. Although embedding multi-level service orchestration ar-
chitectures in nodes of a telecommunication network is often
II. T HE INTENTO PROJECT seen as a way for serving traditional Telco applications as
VNFs, this vision leaves out interesting target areas that are
INTENTO Project has the target to implement a Telco- non-purely Telco. Indeed, several ICT applications exist that
Cloud orchestration platform using open source software either demand or greatly benefit, from the availability of edge-
modules and standardized interfaces. The telecommunication based processing coupled with synchronous inter-node com-
infrastructure includes all the relevant hardware and software munication, in terms of low latency, availability, survivability,
components of a T-SDN architecture, ranging from optical as well as scalability, especially when all is cleanly modeled as
nodes through a two-level network management system, to VNFs and orchestrated as such. For example: Content Delivery
the overall infrastructure management based on a centralized Network (CDN) for efficient Video distribution, Blockchain
orchestrator. On top of that, VNFs are optimally deployed and processing VNF, Camera processing VNF for social distanc-
managed by a centralized orchestrator in order to implement ing and face mask-wearing rule infringements detection for
and test innovative services and applications. anti-COVID-19 precaution, IoT data collection and first-level
aggregation VNF for sensor arrays, vehicle traffic support
systems, and smart grid applications etc.
B. High-level architecture
Referring to Figure 1, the items composing the overall
architecture are:
1) Telecom Nodes: The optical nodes, whose simulator is
based on the SM Optics technology, are the base of the
Telecom infrastructure, providing the connectivity for each
architectural component.
2) Specialized SDN Controller (L1 Controller): The L1
Controller is in charge of managing the Telecom Nodes
simulation instances and represents the actual NMS solution for
SM Optics nodes.
Fig. 1. The conceived high-level framework. 3) Multi-vendor/Multi-domain SDN Controller (L2 Con-
troller): The L2 controller is supposed to act as a generic SDN
Figure 1 describes the reference architecture of the simula- controller, based on standard interface, enabling the simulation
tion framework, highlighting the driving factors: 1) develop- to deal with multi-domain, multi-vendor environment.
ment of the node models according the Yet Another Next Gen- 4) VNF Orchestrator: Provide the support for the whole
eration (YANG) standard, 2) support of T-API interface and lifecycle of VNF instances, from library management to the
NETCONF/RESTCONF communication protocols to ensure actual deployment, to the activation and monitoring functions.
the compatibility with existing network standards, 3) support 5) Framework Orchestrator: It is based on Openstack and
of L1-L3 network layers and related multi-layer management, should be intended as the general orchestrator framework
and 4) implement a NFV infrastructure management enabling needed to exploit all possible services conceived for the
the development and testing of VNFs. proposed infrastructure.
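As a rough illustration of the T-API over RESTCONF interaction named among the driving factors above, the following sketch builds (without sending) a request for the T-API topology context. The controller address is hypothetical, and the `/restconf` root is only the commonly used default of RFC 8040; neither is taken from the INTENTO implementation.

```python
from urllib.request import Request

# Hypothetical address of a level-1 controller; not taken from the paper.
CONTROLLER = "http://l1-controller.example:8181"


def tapi_topology_request(base_url: str) -> Request:
    """Build (but do not send) a RESTCONF GET for the T-API topology context."""
    # RESTCONF (RFC 8040) exposes YANG-modeled data under "<root>/data";
    # "/restconf" is assumed here as the root. The remaining path segments
    # name YANG nodes as "module:node".
    path = "/restconf/data/tapi-common:context/tapi-topology:topology-context"
    return Request(
        base_url.rstrip("/") + path,
        headers={"Accept": "application/yang-data+json"},
        method="GET",
    )


req = tapi_topology_request(CONTROLLER)
print(req.get_method(), req.full_url)
```

Keeping the request construction separate from the network call makes the path and media type easy to inspect before pointing the sketch at a real controller.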

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

III. THE IMPLEMENTED TESTBED
This section focuses on the technical details of the implemented simulation framework, presenting the selected technologies, the integrated components, and the automated deployment. An example showing the current usage of the simulation framework is discussed as well.

A. Components of the simulation framework
The developed simulation framework consists of:
• Network Simulation Agent: The network simulation agents (also known as optical node simulators) are developed to model virtualized optical switches with different characteristics, e.g., bus speed, number of connecting ports, and type of connectors. Each virtual switch can be connected to one or multiple optical switches in the simulation environment.
• Level-1 SDN controller: A proprietary SDN controller has been designed and developed as part of the INTENTO project to enable the management and control of simulated optical nodes. Its core responsibility is the creation and management of the virtualized optical switches on the network simulation agents connected to it. It is also responsible for communication with the multi-vendor SDN controller ONOS on level-2. In the proposed simulation environment, the level-2 SDN controller is connected to several level-1 SDN controllers, each associated with a large number of Network Simulation Agents comprising several virtualized optical switches.
• Level-2 SDN controller: Among the several open-source controllers available in the industry, the most prominent ones are ONOS and OpenDaylight. Both allow communication with third-party controllers through well-known communication protocols (i.e., OpenFlow, NETCONF, and RESTCONF). The motivation behind the selection of ONOS in the INTENTO framework is its communication mechanism. ONOS provides support for T-API at the Southbound interface over REST. In contrast, OpenDaylight earlier provided support for T-API in its UniMgr project, but recent releases of ODL no longer support T-API, which the INTENTO project requires for communication between the level-1 and level-2 controllers.
• Modeling language: YANG is a data modeling language for the definition of data sent over network management protocols such as NETCONF and RESTCONF. It is used in our project to model both configuration data and state data of elements in the network.
• Application deployment technology: Container technology provides an effective way for application deployment. Docker has been selected as the container engine, based on the qualitative cross-comparison of containerization technologies discussed in [6]. All the executables and the required dependencies for the deployment of the level-1 SDN controller and optical node simulator are packed inside the containers to make the application portable and easily deployable.
• Operating environment: The OpenStack cloud has been selected as the operating environment for the simulation framework. It can virtualize and control large pools of computing, storage, and networking resources. OpenStack has been chosen because of its open-source licensing, wide adoption in the industry, active community support, and frequent releases of new features as per industry demands [6].

B. Communication protocols and interaction
The network configuration and communication between the components of the simulation framework is carried out through the following protocols:
• T-API: The Transport API delivers a flexible North-Bound Interface for integrating SDN controllers in the network by facilitating transport communication through a REST API following T-API models, written in YANG.
• RESTCONF/NETCONF: The purpose of these protocols is the communication between multiple controllers and network simulation agents. They provide mechanisms to install, manipulate, and delete the configuration of network devices through remote procedure calls and XML/JSON based data encoding for the configuration data as well as the protocol messages.

C. Achieved implementation and the developed simulation framework
The simulation framework currently being developed integrates the deployment of two levels of SDN controllers within the OpenStack cloud. The proprietary SDN controller developed in this project is deployed on level-1, and the open-source multi-vendor ONOS SDN controller has been placed on level-2 in the framework. The optical node simulator, which is also developed as part of this project, is connected to the level-1 and level-2 controllers via the RESTCONF/NETCONF interfaces. As shown in Figure 2, the optical node simulator is dynamically controlled by the level-1 SDN controller. Each optical node simulator represents a telecom node that can create a large number of virtual interfaces for communication with other telecom nodes. The level-1 SDN controller is connected with the level-2 SDN controller through the T-API interface using the REST API. The level-2 SDN controller can retrieve the information related to simulated nodes either through the level-1 SDN controller or directly from the optical nodes. The topology related information is retrieved through the level-1 SDN controller. The YANG language is used for the communication models between the level-1 and level-2 SDN controllers as well as the optical node simulators. Currently, the level-1 SDN controller can communicate with and control the optical node simulators and perform tasks such as creating multiple interfaces on the

simulation node, connecting multiple nodes, and defining a network topology. This information is made available at the controller's northbound through the T-API interface. As for the level-2 SDN controller, it is introduced to control and abstract large-scale networks individually handled by each level-1 SDN controller. A customized adapter has been developed as part of this project for the communication between the level-1 and level-2 SDN controllers based on T-API in order to transfer the knowledge of the network topology and other information from the level-1 SDN controllers to the orchestrator.

Fig. 2. The developed simulation framework.

D. Automated deployment
An automated procedure has been developed using Ansible, containing YAML files. It creates a 3-layer simulation environment that installs and integrates the level-1 and level-2 SDN controllers along with optical node simulators within the OpenStack cloud on the given remote system. The aim of developing an assisted procedure is to minimize the installation time and reduce the complexity required for the creation of the simulation environment for testing. The base requirement for the automated procedure is a local or remote system running the Ubuntu operating system.

E. Demo of the simulated optical nodes
As shown in Figure 3, four interfaces are created on the optical node simulator representing a telecom node using the level-1 SDN controller within the conceived simulation environment.

Fig. 3. A virtual telecom node representing multiple interfaces on the optical node simulator.

IV. CONCLUSIONS AND FUTURE WORK
This paper presented the vision and goals of the industrial project INTENTO to achieve an innovative simulation framework for the T-SDN paradigm by selecting the best technologies and using it to test services and applications in a real environment. The high-level architecture of the project has been presented, and a real-time complex simulation environment has been developed within the OpenStack cloud, comprising the functionalities of optical node simulation agents, two levels of SDN controllers (one for managing and simulating the advanced optical nodes and the other for integrating multi-vendor and third-party environments), and communication protocols. A demo of the creation of the simulation nodes is also presented in this paper. The future work of the project consists of: the design and implementation of routing strategies that ensure energy efficiency and quality of service in the network; Power Management and monitoring applications comprising YANG models for the energy optimization of the framework; an implementation at the ONOS southbound for retrieving the actual power consumption data from the optical node simulators and instructing the standby/wakeup statuses; SNMP Management of the level-1 SDN controller at the ONOS; and connectivity of the ONOS with the orchestrator. These objectives, when achieved, will be discussed in future versions of the paper.

V. ACKNOWLEDGMENTS
This work was mainly supported by the Apulia Region (Italy) Research project INTENTO (36A49H6). It was also partially supported by the PRIN project no. 2017NS9FEY entitled “Realtime Control of 5G Wireless Networks: Taming the Complexity of Future Transmission and Computation Challenges” funded by the Italian MIUR.

REFERENCES
[1] H. Hawilo, A. Shami, M. Mirahmadi, and R. Asal, “NFV: state of the art, challenges, and implementation in next generation mobile networks (vEPC),” IEEE Network, vol. 28, no. 6, pp. 18–26, 2014.
[2] B. Lakshmi and J. Lakshmi, “Integrating service function chain management into software defined network controller,” in 2019 IEEE World Congress on Services (SERVICES), vol. 2642. IEEE, 2019, pp. 160–165.
[3] M. Mechtri, C. Ghribi, O. Soualah, and D. Zeghlache, “NFV orchestration framework addressing SFC challenges,” IEEE Communications Magazine, vol. 55, no. 6, pp. 16–23, 2017.
[4] M. Garrich, F.-J. Moreno-Muro, M.-V. B. Delgado, and P. P. Mariño, “Open-source network optimization software in the open SDN/NFV transport ecosystem,” Journal of Lightwave Technology, vol. 37, no. 1, pp. 75–88, 2019.
[5] G. Boggia, L. A. Grieco, C. Guaragnella, M. Ruta, M. Forzani, F. Nicassio, V. Simone, M. Mussini, and G. Parladori, “IT and optical network orchestration framework: An industrial research project,” 2018.
[6] A. A. Shah, G. Piro, L. A. Grieco, and G. Boggia, “A qualitative cross-comparison of emerging technologies for software-defined systems,” in 2019 Sixth International Conference on Software Defined Systems (SDS). IEEE, 2019, pp. 138–145.

An Energy Management System at the Edge based on Reinforcement Learning

F. Cicirelli, A. F. Gentile, E. Greco, A. Guerrieri, G. Spezzano, A. Vinci
National Research Council of Italy - Institute for High Performance Computing and Networking (ICAR)
Rende (CS), Italy
(franco.cicirelli, antoniofrancesco.gentile, emilio.greco, antonio.guerrieri, giandomenico.spezzano, andrea.vinci)@icar.cnr.it

Abstract—In this work, we propose an IoT edge-based energy management system devoted to minimizing the energy cost for the daily use of in-home appliances. The proposed approach employs a load scheduling based on a load shifting technique, and it is designed to operate naturally in an edge-computing environment. The scheduling jointly considers time-variable profiles for energy cost, energy production, and energy consumption for each shiftable appliance. Deadlines for load termination can also be expressed. In order to address these goals, the scheduling problem is formulated as a Markov decision process and then processed through a reinforcement learning technique. The approach is validated by the development of an agent-based real-world test case deployed in an edge context.

Index Terms—Edge Computing, Reinforcement Learning, Energy Management Systems, Internet of Things, Multi-Agent Systems.

I. INTRODUCTION
Appliances and new technologies simplify everyday life, making it easier to carry out daily domestic activities and guaranteeing surprising results. There are several types of available appliances: for cooking (e.g., hoods, hobs, ovens, and microwaves), for making food (e.g., small appliances such as blenders, mixers, and food processors), for keeping food fresh (e.g., refrigerators, cellars), for cleaning (e.g., vacuum cleaners), for washing (e.g., washing machines, dishwashers), for entertainment (e.g., smart TVs, game consoles, home theaters), and for indoor wellness (e.g., air conditioners and purifiers). Inefficient use of these appliances causes a waste of energy and time. One simple way to deal with this is to provide consumers with some feedback. Feedback can be used to inform about waste and give suggestions or best practices to enhance user behavior regarding energy. In such a case, the user remains in charge of taking proper actions to deal with energy management. Another, more effective way to reduce energy consumption is to apply Demand Side Management (DSM) techniques [1]. DSM in the smart grid allows customers to make autonomous decisions on their energy consumption, helping energy providers to reduce the energy peaks in load demand. The automated scheduling of smart devices in residential and commercial buildings plays a key role in DSM [2].

DSMs can be considered as a part of the so-called modern Cognitive Buildings [3]. Cognitive buildings, the natural evolution of the smart buildings [4], are environments equipped with sensor and actuator capabilities that exploit the Internet of Things (IoT) paradigm and cognitive abilities. Cognitive buildings differ from smart buildings because they can learn, reason, adapt, and cooperate with each other to make decisions in a time-constrained fashion. A key feature is to collect and analyze environmental data and consumer habits information to proactively operate in order to efficiently manage the building's resources (for example, its internal and external spaces, technological infrastructures and systems), with the goal of improving energy performance and raising the level of well-being and safety of the building inhabitants.

Applying DSM techniques is not a trivial issue because it requires considering several factors together in order to minimize energy costs. As an example, it is necessary to take into account the presence of PV (Photo-Voltaic) solar panels, which gives rise to the problem of using as much as possible the self-produced energy so as to minimize the cost of buying energy from the electrical grid [5]. Another factor that increases the complexity of applying DSM is the variable price of energy [6].

An effective way to implement DSM techniques is to stimulate customers to shift loads from peak periods to off-peak periods or decrease their electricity usage during peak times, always taking into account people's preferences in using appliances and the self-produced energy. In such a context, Reinforcement Learning (RL) algorithms can make the right decision in computing and operating an appropriate load scheduling [7].

DSM can significantly benefit from the exploitation of the Edge Computing paradigm [8], [9]. The key concept of edge computing is distributing the power of data processing to the edge of a system, giving devices, sensors, and gateways the capability to act or make decisions locally without relying on a far cloud environment. In such a case, advantages for DSM are tied to latency reduction, bandwidth saving, and privacy preservation [10]–[12]. An edge-based DSM can be replicated and disseminated on different computing nodes residing in different places to execute locally in every building, thus taking into account the specific context in which the system operates. Moreover, in order to foster high performance in using an RL approach for DSM, edge computing favors efficient data transfer between the RL algorithm and the physical devices that must be managed. As another side benefit, edge computing reduces costs, as it avoids the need to buy cloud resources.

In this paper, we focus on the design and implementation

978-1-7281-7343-6/20/$31.00 ©2020 IEEE


of an IoT-based energy management system for DSM that exploits reinforcement learning and is based on an edge computing infrastructure. The goal is to schedule in-home appliances to minimize the total cost spent on energy while taking into account some users' preferences. The scheduling process is modeled as a Markov decision process (MDP) [13], and RL is used to determine an effective policy that aims to minimize the energy cost.

The contribution of the paper jointly considers: (i) time-variable profiles for energy cost, energy production, and energy consumption for each appliance; (ii) the presence of both not-shiftable and always-on loads in a building; (iii) the definition of deadlines within which the appliances have to be executed. Moreover, (iv) the system has been designed to be naturally distributed and capable of being deployed on several edge nodes that can be spread in different parts of a building to fully exploit the edge-related advantages.

To prove the effectiveness of the approach, a real case study has been implemented in the context of the IoT Laboratory at ICAR-CNR (Rende, Italy). For realization purposes, the agent-based COGITO IoT platform [14] has been used. Developed at ICAR-CNR, COGITO proved to be effective for the realization of cognitive building applications.

The remainder of the paper is structured as follows: Section II introduces some background concepts useful to understand the rest of the paper and some related work; Section III describes the problem statement and how it is modeled. Moreover, it introduces the simulated environment and the reward function used. Section IV shows an implemented case study and gives some experimental results. Finally, some conclusions are drawn and future work is outlined.

II. BACKGROUND AND RELATED WORK
This section provides some fundamental information about the concepts exploited in the rest of the paper. It also provides a view on some other works in the literature having similar approaches with regard to the topics of this paper.

A. Reinforcement learning
RL is a technique useful for training decision-maker agents. It is studied in different fields, e.g., game theory, optimization, and control theory. RL considers four basic components: agent, environment, reward, and action. An agent can observe a dynamic environment and interact with it by taking actions. A reward is given to the agent for each action it takes. Agents are trained by RL so as to make decisions that maximize the given (cumulative) rewards. RL algorithms are useful to train decisors that operate with limited knowledge of both the environment and the expected quality of each decision they take [7].

RL algorithms include State-Action-Reward-State-Action (SARSA), Q-Learning, Deep Q-Learning, and Asynchronous Advantage Actor-Critic (A3C) [7], [15], [16]. Such algorithms are all based on learning-by-experience. The agent is trained by running a set of simulations in which the agent interacts with the environment. After each simulation, the agent incrementally updates its knowledge about the problem and eventually learns which actions to take to maximize the reward.

B. Markov decision processes
A Markov decision process (MDP) [13] is an extension of the Markov process that embeds the concepts of actions and rewards. An MDP is defined as a tuple $(S, A, P_a, R_a)$ where:
• $S$ defines the set of the states;
• $A$ defines the set of the actions;
• $P_a(s, s')$ defines the transition probabilities, i.e., given a couple of states $s, s' \in S$ and an action $a \in A$, $P_a(s, s')$ is the probability to move from the state $s$ to the state $s'$ with the action $a$;
• $R_a(s)$ defines the reward obtained by taking the action $a \in A$ when in the state $s \in S$.

Given an MDP, it is possible to define a policy function $\pi(s)$ which gives, for each state $s \in S$, an action $a \in A$ to undertake. A policy function is optimal if it maximizes the expected cumulative reward of an MDP.

For MDPs having a finite state space, a finite action space, a fully defined probability transition function, and a fully defined reward function, it is possible to compute an optimal policy by exploiting dynamic programming. In the other cases, specific RL algorithms can be exploited to estimate effective policies.

C. COGITO
COGITO [14] is an agent-based IoT platform tailored to the design and implementation of cognitive environments. A cognitive environment extends the concept of smart environment [17], [18] by promoting the exploitation of cognitive-based technologies [19], which aim at realizing systems able to automatically adapt to changes in users' behavior and to anticipate and predict users' activities and needs.

COGITO, currently implemented in Java, relies on the agent metaphor [20] and naturally permits exploiting the benefits of both edge and cloud computing. The agent paradigm has been chosen since it is well suited for the implementation of distributed and pervasive systems. In fact, agents can execute near the devices they need to control/manage, thus implementing the edge computation and enabling real-time analysis on the data gathered on single nodes. Furthermore, agents can execute on the cloud for implementing out-of-the-edge functionalities (e.g., data mining or data storage) and taking advantage of the features of the cloud. The COGITO platform offers the Virtual Objects abstraction, suited to hide the heterogeneity of physical devices and communication protocols. COGITO promotes modularity and separation of concerns and offers some built-in features which can be exploited to aggregate/filter information at the edge and to perform data fusion on data coming from the deployed sensors. Other primitives are made available to simplify the use of artificial intelligence libraries (e.g., for machine learning) in a distributed and heterogeneous cross-language environment.
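The dynamic-programming route mentioned in Section II-B for fully specified finite MDPs can be sketched with value iteration. The two-state MDP below and the discount factor `gamma` are assumptions made purely for illustration; they are not part of the scheduling model of this paper.

```python
# Toy MDP, invented for illustration only.
STATES = ["idle", "busy"]
ACTIONS = ["wait", "run"]

# P[(s, a)] maps each successor state s' to the probability P_a(s, s').
P = {
    ("idle", "wait"): {"idle": 1.0},
    ("idle", "run"): {"busy": 0.9, "idle": 0.1},
    ("busy", "wait"): {"busy": 1.0},
    ("busy", "run"): {"idle": 1.0},
}
# R[(s, a)] is the reward R_a(s) for taking action a in state s.
R = {
    ("idle", "wait"): 0.0,
    ("idle", "run"): -1.0,
    ("busy", "wait"): 2.0,
    ("busy", "run"): 0.0,
}


def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-9):
    """Return state values and a greedy policy for a finite, fully known MDP."""
    V = {s: 0.0 for s in states}

    def q(s, a):
        # Immediate reward plus discounted expected value of the successor.
        return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())

    while True:
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:  # values stopped changing: fixed point reached
            break
    policy = {s: max(actions, key=lambda a: q(s, a)) for s in states}
    return V, policy


V, policy = value_iteration(STATES, ACTIONS, P, R)
print(policy)
```

This only works because every transition probability and reward is known up front; when they are not, as in the scheduling problem addressed later, the policy has to be estimated from simulated experience instead.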

D. Load scheduling and reinforcement learning
In the literature, other works have tackled the problem of appliance scheduling through RL. In [21], the authors compared variations of the SARSA and Q-learning algorithms by applying them to find a scheduling of six loads. However, they considered a time granularity of one hour in the scheduling (thus permitting long gaps in the appliance scheduling) and did not consider self-produced energy. In [22], the authors propose a load scheduling algorithm using RL to minimize the electricity bill considering a renewable energy source. The paper presents some interesting schedulings of six loads based on a time granularity of one hour. Moreover, it does not consider real deployments of the implemented system on a distributed environment. The work in [23] introduces CAES as an energy management system for residential demand-response applications. CAES is based on Q-learning and has been developed to adapt to changes in a consumer's preferences or in the energy market prices. Although very interesting, this work does not take into account time-variable profiles regarding self-production of energy or energy consumption of the appliances.

III. PROBLEM STATEMENT AND MODELING
The goal of the proposed approach is the definition of a schedule of load activations which aims at minimizing the costs for energy and avoiding the exceeding of consumption peaks.

The scheduling algorithm evolves on a step basis. At each step, the agent sees the environment and chooses which loads can be activated. An activated load cannot be suspended. The duration of the steps is configurable. As an example, by considering time-steps with a duration of 15 minutes, the model considers 96 scheduling time-steps in a day.

A. Parameters and goals
Specifically, the approach considers as inputs the following parameters:
• the number $N$ of shiftable loads to be scheduled;
• the interval $T$, in minutes, which is the duration of a single time-step in the considered scheduling;
• the maximum number of time-steps $D$ admitted for scheduling the loads;
• for each load $i$, the boolean parameter $\sigma_i$ which is its initial setup, i.e., on (1) or off (0);
• for each load $i$, the parameter $\beta_i$ which is the deadline of the load execution, expressed in time-steps;
• for each load $i$, the parameter $\alpha_i$ describing the remaining execution time, expressed in time-steps, of the load once it is activated;
• for each load $i$, the function $c_i(t_a)$ is the maximum instantaneous consumption of the load $i$ at the $t_a$ time-step since its activation, expressed in kW. $c_i(t_a)$ is considered 0 before the activation of the load $i$ and after it completes its execution, that is if $t_a < 0$ or $t_a \geq \alpha_i$;
• for each load $i$, the function $C_i(t_a)$ is the cumulative consumption of the load $i$ at the $t_a$ time-step since its activation, expressed in kWh. $C_i(t_a)$ is considered 0 before the activation of the load $i$ and after it completes its execution, that is if $t_a < 0$ or $t_a \geq \alpha_i$. Both $c_i$ and $C_i$ define the consumption profiles of the load $i$;
• the function $p(t)$ is the price per kWh of the energy at time-step $t$;
• the function $e(t)$ is the self-produced energy, in kWh (e.g., by a PV solar panel), at time-step $t$. It is possible to consider both real and forecasted profiles for energy production. The use of self-produced energy is considered to be free of charge;
• the function $k_{peak}(t)$ is the maximum permitted instantaneous energy consumption, in kW, at time-step $t$. By reducing this function, it is possible to model the presence of not-shiftable or always-on loads.

The output of the approach is a vector $L = [l_1, l_i, \ldots, l_N]$ which, for each load $i$, gives the time-step $t$ at which the load has to be activated.

The approach pursues the goal of finding a scheduling which minimizes the cost of executing all the loads:

$$\text{minimize}: \; C_{tot} \tag{1}$$

where $C_{tot}$ is defined as:

$$C_{tot} = \sum_{t=0}^{D-1} \left( p(t) \ast \max\left( 0, \left( \sum_{i=1}^{N} C_i(t - l_i) \right) - e(t) \right) \right) \tag{2}$$

A scheduling should also address the following constraints:

$$D - l_i \geq \alpha_i, \quad \forall i \tag{3}$$

$$\sum_{i=1}^{N} v_{\beta i} = 0 \tag{4}$$

$$\sum_{i=0}^{D-1} v_{k i} = 0 \tag{5}$$

where:
• $V_\beta = [v_{\beta 1}, v_{\beta 2}, \ldots, v_{\beta N}]$ is a vector representing, for each load, its deadline violations expressed in time-steps, i.e., the number of time-steps that exceed the related $\beta_i$ parameter;
• $V_k = [v_{k1}, v_{kt}, \ldots, v_{kD}]$ is a vector of $D$ elements showing, for each time-step $t$, the amount of kW that exceeds the $k_{peak}(t)$ function.

The constraint described in Equation (3) has the aim of guaranteeing that the loads complete their execution during the $D$ time-steps of the scheduling.

Since we are interested in scheduling the loads by exploiting an anytime heuristic, we also consider as admissible a scheduling in which the constraints (4) and (5) do not hold. Thus, the approach will try to obtain a scheduling which minimizes the following:

$$\sum_{i=1}^{N} v_{\beta i} \tag{6}$$

Algorithm 1 The nextState transition function


D−1
X Input: H, t, α, β, a
vki (7)
1: H ← H | a // bitwise OR between H and a
i=0
2: for i = 1 to N do
B. Problem modeling 3: if (Hi = 1) then
The above described scheduling problem is reduced to a 4: αi ← αi − 1
RL problem which exploits the agent-environment paradigm 5: else
shown in Figure 1. 6: βi ← βi − 1
7: end if
8: end for
9: r ← reward(H, t, α, β)
10: t ← t + 1
11: return r, H, t, α, β

The representation of the environment exploited by


the agent is modeled as a Markov Decision Process
Fig. 1: The agent-environment model for Reinforcement hS, A, Pa , Ra i, where the transition immediate rewards Ra
Learning are not known. Given a problem instance with N loads and a
duration of D time-steps:
The agent is a decisor that, given an observed state of
the environment, can perform a specific action on it. In our • the set of states S is given by all the possible binary states
approach, an observed state contains information about the of the loads multiplied by the admissible time-steps. The
current time-step and about which loads have been activated before the current time-step. Given an observed state, an agent can request the activation of a subset of the loads, or do nothing. When the (simulated) environment is requested to perform an action, it updates its internal state accordingly and produces a new observed state and a reward value for the agent.

For each problem instance, an agent is trained by using RL. The simulated environment furnishes a reward that gives prizes that increase as the overall energy cost decreases, and gives penalties when the previously defined constraints are violated. In more detail, such penalties grow as the quantities in Equations (6) and (7) increase. A simulation starts at time-step $t = 0$ and ends after $D$ time-steps.

For a given problem instance, the agent runs a set of simulations having the same starting condition, which is built from the problem instance. Each simulation is referred to as an episode. From each episode, the agent incrementally learns how to behave in order to maximize the cumulative reward by exploiting the Q-learning algorithm [24], in which a state-action-value function $Q(s, a)$, initialized to 0 for each $s$ and $a$, is updated according to the following:

\[ Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \delta \left[ r_t + \gamma \max_{a} Q(s_{t+1}, a) - Q(s_t, a_t) \right] \tag{8} \]

where $s_t$ is the observed state of the environment at time-step $t$, $a_t$ is the action at time-step $t$, $\delta$ is the learning rate, $\gamma$ is the discount factor and $r_t$ is the reward at time-step $t$. Given a state $s$, the decisor agent chooses the action $a$ following the epsilon-greedy method, i.e., it chooses the action $a$ that maximizes $Q(s, a)$, or a random action with a probability $\epsilon$ which decreases each time-step, thus favoring the exploration of different solutions.

state of a load is defined as 0 if the load has not yet been activated, or 1 if the load is active or has terminated its execution. Each state $s$ is identified by a pair $\langle H_s, t_s \rangle$ where $H_s$ is a boolean vector of $N$ elements describing the states of the loads, and $t_s$ is the system time-step. For example, the state identified by $\langle 010, 15 \rangle$ is the system state in which, at time-step 15, only the second load (out of three loads) is active/activated. As a consequence, $|S| = |D| \times 2^N$.
• the set of admissible actions $A$ is given by all the possible boolean vectors $a = a_1, \ldots, a_N$ where, for a given load $i$, a value of 1 at the $i$-th position represents the request for its activation, thus $|A| = 2^N$.
• the transition probability $P_a(s, s')$, which defines the probability of moving from state $s = \langle H_s, t_s \rangle$ to state $s' = \langle H_{s'}, t_{s'} \rangle$ given the action $a$, is defined as follows:

\[ P_a(s, s') = \begin{cases} 1, & \text{if } (H_{s'} = H_s \mid a) \wedge (t_{s'} = t_s + 1); \\ 0, & \text{otherwise.} \end{cases} \tag{9} \]

where $\mid$ identifies the bitwise OR operation. In other words, given an action $a$, computing the reached state is deterministic.

It is worth noting that this model purposely does not explicitly take into account many of the input parameters listed in Section III-A, as well as other environmental variables, introduced in the next section, that are not directly observed by the agent. For instance, the model does not take into account whether a load $i$ has completed its execution. To cope with the well-known state explosion problem, such parameters and variables are implicitly considered in the reward that the agent receives after it has undertaken a given action.
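As an illustration of Equations (8) and (9), the following minimal Python sketch shows one possible way to code the deterministic transition, the epsilon-greedy choice and the Q-learning update. It is not the authors' implementation: the dictionary-based Q-table, the function names, and the encoding of $H_s$ and $a$ as integer bitmasks are assumptions made here for illustration only.

```python
import random
from collections import defaultdict


def next_state(state, action):
    """Deterministic transition of Eq. (9): H' = H | a and t' = t + 1.

    `state` is a pair (H, t); H and `action` are integer bitmasks
    (an encoding assumed for this sketch).
    """
    h, t = state
    return (h | action, t + 1)


def choose_action(q, state, n_loads, epsilon, rng=random):
    """Epsilon-greedy choice over the 2^N boolean action vectors."""
    n_actions = 1 << n_loads
    if rng.random() < epsilon:
        return rng.randrange(n_actions)  # explore
    return max(range(n_actions), key=lambda a: q[(state, a)])  # exploit


def q_update(q, state, action, reward, new_state, n_loads, delta, gamma):
    """One application of the update rule of Eq. (8)."""
    best_next = max(q[(new_state, a)] for a in range(1 << n_loads))
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += delta * td_error
    return q[(state, action)]


# Q(s, a) is initialised to 0 for every state-action pair.
q_table = defaultdict(float)
```

With a table of this kind the memory footprint grows with $|S| \times |A|$, which is one reason why the text keeps the observed state small.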


C. The simulated environment and the reward function

The simulated environment is characterized by its state $E_s = \langle H, t, \alpha = \{\alpha_1, \ldots, \alpha_N\}, \beta = \{\beta_1, \ldots, \beta_N\} \rangle$ and by the nextState transition function that, given an action $a$ and a state $E_s$, determines a new state $E_{s'}$ and a reward $r$. $H_s$ and $t_s$ are made available to the decisor agent. Besides this state information, the environment receives, as constants, the set of parameters specified in Section III-A.

The nextState function is shown in Algorithm 1. Given the action $a$, $H$ is modified so as to activate the loads not yet activated that are requested by the action $a$. Then, $\alpha_i$ is decreased for each activated load and $\beta_i$ is decreased for each not activated load. Finally, the time-step $t$ is increased by 1.

The reward function is computed as described in Algorithm 2. The reward is composed of five contributions. Contribution (A) is the termination reward, which gives a prize when a load completes its execution (i.e., $\alpha_i = 0$). Contribution (B) gives a penalty for each $\beta_i$ constraint that cannot be met, thus favoring schedules that minimize Equation (6). Contribution (C) gives a prize for each active load, thus favoring schedules in which all the loads complete their execution. Contribution (D) gives a penalty when the total instantaneous consumption exceeds the $k_{peak}$ threshold, thus favoring schedules that minimize Equation (7). Finally, contribution (E) gives a penalty which is proportional to the energy cost at the current time-step, thus favoring schedules that minimize Equation (2). It is worth noting that contribution (E) takes into account the self-produced energy.

Algorithm 2 The reward function
Const: c, C, p, e, k_peak
Input: H, t, α, β, a
    r ← 0
    maxPrice ← max(p(t))
    maxCost ← k_peak ∗ T ∗ maxPrice
    totCumulC ← 0   // current total cumulative consumption
    totInstC ← 0    // current total instantaneous consumption
    for i ← 1 to N do
        if (α_i = 0) then
            r ← r + 10   // contribution (A)
        end if
        if (α_i > 0 ∧ β_i ≤ 0) then
            r ← r − 1    // contribution (B)
        end if
        if (H_i = 1 ∧ α_i > 0) then
            r ← r + 1    // contribution (C)
            totCumulC ← totCumulC + C_i(t)
            totInstC ← totInstC + c_i(t)
        end if
    end for
    if (totInstC > k_peak) then
        r ← r − 1        // contribution (D)
    end if
    r ← r − p(t) ∗ (totCumulC − e(t)) / maxCost   // contribution (E)
    return r

IV. A SYSTEM FOR ENERGY MANAGEMENT AT THE EDGE

To validate the proposed approach, a prototype system was implemented in the IoT Laboratory at ICAR-CNR (Rende, Italy). The system allows a user to request a schedule of loads in a home environment, and then finds such a schedule by exploiting the proposed approach. The schedule found has to be approved by the user and, if accepted, it is applied to the real loads. Users' interactions are mediated by a dedicated GUI.

A. Use Case Design

The architecture of the system is depicted in Figure 2. The components shown as ellipses correspond to agents, the rectangles to virtual objects (see Section II-C), and the hexagons to physical devices. Arrows between components highlight communications; the grey label near an arrow specifies the exchanged data. Such data comprise the parameters explained in Section III-A. A description of the software agents follows.
• Scheduling Manager: it is in charge of gathering all the parameters required for the scheduling to be calculated. When all the parameters are available, it forwards them to the Scheduling Finder and waits to receive the scheduling. Once the scheduling is accepted by the user, a message is sent to the loads containing the time at which they have to be activated;
• Scheduling Finder: it is devoted to performing the RL algorithm described in Section III-B, thus furnishing the computed scheduling;
• Energy Manager: it is responsible for furnishing the energy price per kWh by requesting such data from the smart power grid;
• Energy Production Manager: it is the component that furnishes the profile of the self-produced energy available in the system. For testing purposes, in order to consider realistic curves, the estimated production profile is derived through the use of a small PV solar panel;
• Mirror Load$_i$: it manages Load$_i$ through the related virtual object $VO_i$, monitors the energy consumption profile of the load, and furnishes such information to the Scheduling Manager when needed. It is also responsible for turning on the load at the correct time, as requested by the Scheduling Manager;
• Mirror PV panel: by using the virtual object $VO_{PV}$, it monitors the energy produced by the PV solar panel and forwards such information to the Energy Production Manager.

B. Use Case Realization

The software components described above are deployed on two edge nodes, each made of a Raspberry Pi 3 and hosting the COGITO platform. In more detail, the first node is directly connected to all the physical devices, and hosts all the Mirror


Fig. 2: The architecture of the implemented appliance scheduling management system
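For reference, the reward computation of Algorithm 2 (Section III-C) can be sketched in Python as follows. This is only a sketch under stated assumptions, not the code deployed on the prototype: the function name and signature are hypothetical, the price $p$ and the production $e$ are assumed to be given as lists indexed by time-step, and the consumptions $C_i$ and $c_i$ are assumed constant per load. The action $a$ listed among the inputs of Algorithm 2 is not used by the reward itself and is therefore omitted.

```python
def reward(H, t, alpha, beta, c, C, p, e, k_peak, T):
    """Reward of Algorithm 2 for the environment state (H, t, alpha, beta).

    Assumptions of this sketch: H, alpha, beta, c and C are per-load lists,
    p and e are per-time-step lists, T is the time-unit length as a number.
    """
    r = 0.0
    max_cost = k_peak * T * max(p)   # normalisation constant for (E)
    tot_cumul = 0.0                  # current total cumulative consumption
    tot_inst = 0.0                   # current total instantaneous consumption
    for i in range(len(H)):
        if alpha[i] == 0:
            r += 10                  # (A) load has completed its execution
        if alpha[i] > 0 and beta[i] <= 0:
            r -= 1                   # (B) beta constraint cannot be met
        if H[i] == 1 and alpha[i] > 0:
            r += 1                   # (C) load is currently active
            tot_cumul += C[i]
            tot_inst += c[i]
    if tot_inst > k_peak:
        r -= 1                       # (D) peak threshold exceeded
    r -= p[t] * (tot_cumul - e[t]) / max_cost   # (E) normalised energy cost
    return r
```

Note that, as written in Algorithm 2, contribution (E) becomes a small prize whenever the self-produced energy exceeds the cumulative consumption at the current time-step.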

agents, the related virtual objects, the Scheduling Manager, the Energy Production Manager, and the Energy Manager. The second node hosts only the Scheduling Finder, which is the most computationally demanding software component in the system. The virtual objects interact with the physical devices by exploiting the MQTT protocol.

For the case study, the loads are realized by considering six different bulbs, each of them characterized by a different energy profile. Each load is controlled by a POWR2 sonOff Smart Switch (https://sonoff.tech/product/wifi-diy-smart-switches/powr2) enhanced with the TASMOTA open-source firmware (https://tasmota.github.io/docs/), which is also exploited to monitor the energy consumption of each load. The hardware components described so far have been deployed on a purposely realized demo panel (see Figure 3). The panel presents some plugs that can host non-shiftable or always-on loads. A further sonOff has been added to the panel to monitor all the loads together (both schedulable and always-on) and to take into account the presence of possible leakage current. Such sonOff emulates a standard home electric meter.

A further hardware component realized for the case study is the PV solar panel used for the estimation of the energy production profile. Such panel is installed on a window of the laboratory and is shown in Figure 3c.

C. Experimental Results

Some preliminary results are shown in Figure 4. The results refer to a scenario in which a scheduling has been requested at time-step $t = 60$ with the parameters listed in the following:
• $D = 96$
• $N = 6$
• $T = 15$ min
• $\alpha = \{5, 2, 7, 2, 2, 2\}$
• $\beta = \{10, 11, 12, 6, 6, 6\}$
• $C = \{C_1, C_2, C_3, C_4, C_5, C_6\} = \{0.5, 0.7, 1.8, 0.5, 0.8, 1.2\}$, in kW per time unit $T$. We assume that each load has a constant power consumption during its execution
• $c = \{c_1, c_2, c_3, c_4, c_5, c_6\} = \{0.5, 0.7, 1.8, 0.5, 0.8, 1.2\}$, in kW. We assume that each load has a constant maximum peak consumption during its execution
• $k_{peak} = 2$, in kW
• $p(t)$, price per kW per time unit $T$, defined as:

\[ p(t) = \begin{cases} 0.093, & 32 < t < 76 \\ 0.053, & \text{otherwise} \end{cases} \tag{10} \]

• $e(t)$, self-produced energy in kW per time unit $T$, defined as:

\[ e(t) = \begin{cases} 0.3, & 32 < t < 48 \vee 64 < t < 72 \\ 0.8, & 48 < t < 64 \\ 0.0, & \text{otherwise} \end{cases} \tag{11} \]

This function has been obtained by rounding the production profile observed from the PV solar panel.
• $\epsilon = 0.001 + (0.25 - 0.001) \cdot e^{-0.008 \cdot epnumber}$: the probability of choosing a random action following the $\epsilon$-greedy policy decreases exponentially as the episode number $epnumber$ increases.
• $\delta = 0.1 + (0.4 - 0.1) \cdot e^{-0.008 \cdot epnumber}$: the learning rate for the Q-learning function decreases similarly to the $\epsilon$ probability.
• $\gamma = 0.99$, the discount factor for the Q-learning function.

Figure 4 shows, respectively, how the rewards, the number of peak violations, the number of beta violations, and the total scheduling cost evolve during the training of the scheduler, considering 500 episodes. In more detail, Figure 4a shows, for each episode, the cumulative reward obtained. The reward reaches about 70 after 200 episodes. Later on, the peak violations are eliminated (see Figure 4b) and the algorithm continues to try to learn how to reduce the beta violations (see Figure 4c). The minimum reached on the beta violations, at about episode 430, causes an increase in the total cost (see Figure 4d), but a slight increase in the total reward. It is worth noting that


Fig. 3: The Demo Panel: (a) Front View; (b) Rear View; (c) PV Solar Panel.

Fig. 4: Experimental results: (a) reward vs episodes; (b) peak violations number vs episodes; (c) beta violations number vs episodes; (d) total cost vs episodes.
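The time-dependent parameters of the experimental scenario in Section IV-C (Equations (10) and (11) and the decay schedules of $\epsilon$ and $\delta$) can be written down directly. The sketch below is illustrative only; the function names are hypothetical and not taken from the prototype.

```python
import math


def price(t):
    """Energy price p(t) of Eq. (10)."""
    return 0.093 if 32 < t < 76 else 0.053


def produced(t):
    """Self-produced energy e(t) of Eq. (11)."""
    if 32 < t < 48 or 64 < t < 72:
        return 0.3
    if 48 < t < 64:
        return 0.8
    return 0.0


def epsilon(ep_number):
    """Exploration probability, decaying exponentially with the episode number."""
    return 0.001 + (0.25 - 0.001) * math.exp(-0.008 * ep_number)


def learning_rate(ep_number):
    """Learning rate delta, decaying with the same exponent as epsilon."""
    return 0.1 + (0.4 - 0.1) * math.exp(-0.008 * ep_number)
```

Because the bounds in Equation (11) are strict, this literal transcription returns 0.0 at the boundary time-steps 48 and 64.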


the convergence is guaranteed since both $\epsilon$ and $\delta$ decrease exponentially, thus after a certain number of episodes the algorithm stops exploring new solutions.

The scheduling furnished by the system corresponds to the following array: $L = \{l_1, l_2, l_3, l_4, l_5, l_6\} = \{64, 66, 75, 64, 61, 69\}$, which, as stated in Section III-A, represents for each load $i \in \{1, \ldots, 6\}$ its activation time-step.

V. CONCLUSIONS

The paper proposed an IoT edge-based energy management system devoted to minimizing the energy cost for the daily use of in-home appliances. For this purpose, a scheduling problem formulation was first given, then reduced to a reinforcement learning problem modeled through a Markov decision process in an agent-environment scenario. The given MDP, the environment, and the defined reward function were capable of taking into account users' preferences on load execution deadlines along with time-variable profiles for energy cost, energy production, and energy consumption for each shiftable appliance. The realized test case confirmed the feasibility of the proposed approach in an edge-based environment. The preliminary experimental results have shown the effectiveness of the exploited scheduling algorithm.

Future work is geared at:
• improving the approach by considering energy storage facilities;
• trying other reinforcement learning approaches, such as Deep Reinforcement Learning, to cope with a larger number of loads;
• extending the formulation of the scheduling problem by considering other aspects like load priority and the possibility of selling self-produced energy and/or buying energy from different suppliers;
• adapting the approach to a more complex smart grid context, trying to coordinate agents in different houses to optimize the energy costs in a whole district;
• integrating the proposed system in a test case that considers synergies with other edge-based home applications regarding safety, security, and thermal comfort.

ACKNOWLEDGMENT

This work has been partially supported by the “COGITO” project, funded by the Italian Government (ARS01 00836), and by the “GLAMOUR” project, funded by POR Calabria FESR-FSE, Italy (CUP: J28C17000250006).

REFERENCES

[1] P. Palensky and D. Dietrich, “Demand side management: Demand response, intelligent energy systems, and smart loads,” IEEE Transactions on Industrial Informatics, vol. 7, no. 3, pp. 381–388, Aug 2011.
[2] F. Fioretto, W. Yeoh, and E. Pontelli, “A multiagent system approach to scheduling devices in smart homes,” in Workshops at the Thirty-First AAAI Conference on Artificial Intelligence, 2017.
[3] J. Ploennigs, A. Ba, and M. Barry, “Materializing the promises of cognitive iot: How cognitive buildings are shaping the way,” IEEE Internet of Things Journal, vol. 5, no. 4, pp. 2367–2374, 2017.
[4] F. Cicirelli, G. Fortino, A. Guerrieri, G. Spezzano, and A. Vinci, “Metamodeling of smart environments: from design to implementation,” Advanced Engineering Informatics, vol. 33, pp. 274–284, 2017.
[5] G. Belli, A. Giordano, C. Mastroianni, D. Menniti, A. Pinnarelli, L. Scarcello, N. Sorrentino, and M. Stillo, “A unified model for the optimal management of electrical and thermal equipment of a prosumer in a dr environment,” IEEE Transactions on Smart Grid, vol. 10, no. 2, pp. 1791–1800, 2017.
[6] N. Amjady and M. Hemmati, “Energy price forecasting - problems and proposals for such predictions,” IEEE Power and Energy Magazine, vol. 4, no. 2, pp. 20–29, 2006.
[7] D. Zhang, X. Han, and C. Deng, “Review on the research and practice of deep learning and reinforcement learning in smart grids,” CSEE Journal of Power and Energy Systems, vol. 4, no. 3, pp. 362–370, 2018.
[8] W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu, “Edge computing: Vision and challenges,” IEEE internet of things journal, vol. 3, no. 5, pp. 637–646, 2016.
[9] F. Cicirelli, A. Guerrieri, A. Mercuri, G. Spezzano, and A. Vinci, “Itema: A methodological approach for cognitive edge computing iot ecosystems,” Future Generation Computer Systems, vol. 92, pp. 189–197, 2019.
[10] K. Shahryari and A. Anvari-Moghaddam, “Demand side management using the internet of energy based on fog and cloud computing,” in 2017 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData). IEEE, 2017, pp. 931–936.
[11] L. Y-H and H. Y-C., “Residential consumer-centric demand-side management based on energy disaggregation-piloting constrained swarm intelligence: Towards edge computing,” Sensors, vol. 18, no. 5, p. 1365, 2018.
[12] T. Li, Y. Xiao, and L. Song, “Deep reinforcement learning based residential demand side management with edge computing,” in 2019 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). IEEE, 2019, pp. 1–6.
[13] M. L. Puterman, Markov decision processes: discrete stochastic dynamic programming. John Wiley & Sons, 2014.
[14] F. Cicirelli, A. Guerrieri, G. Spezzano, and A. Vinci, “A Cognitive Enabled, Edge-Computing Architecture for Future Generation IoT Environments,” in Proceeding of the IEEE 5th World Forum on Internet of Things, Limerick, Ireland, 2019.
[15] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Lillicrap, T. Harley, D. Silver, and K. Kavukcuoglu, “Asynchronous methods for deep reinforcement learning,” in International conference on machine learning, 2016, pp. 1928–1937.
[16] F. Cicirelli, A. Guerrieri, C. Mastroianni, G. Spezzano, and A. Vinci, “Thermal comfort management leveraging deep reinforcement learning and human-in-the-loop,” in Accepted for the Proc. of the 1st IEEE International Conference on Human-Machine Systems (ICHMS2020), 2020.
[17] S. Das and D. Cook, “Designing and modeling smart environments,” in World of Wireless, Mobile and Multimedia Networks, 2006. WoWMoM 2006. International Symposium on a, Buffalo-Niagara Falls, NY, 2006, pp. 5 pp.–494.
[18] F. Cicirelli, A. Guerrieri, G. Spezzano, A. Vinci, O. Briante, A. Iera, and G. Ruggeri, “Edge computing and social internet of things for large-scale smart environments development,” IEEE Internet of Things Journal, no. 99, 2017.
[19] A. K. Noor, “Potential of cognitive computing and cognitive systems,” Open Engineering, vol. 5, no. 1, 2015.
[20] M. Wooldridge, An introduction to multiagent systems. John Wiley & Sons, 2009.
[21] N. Chauhan, N. Choudhary, and K. George, “A comparison of reinforcement learning based approaches to appliance scheduling,” in 2016 2nd International Conference on Contemporary Computing and Informatics (IC3I), 2016, pp. 253–258.
[22] T. Remani, E. Jasmin, and T. I. Ahamed, “Residential load scheduling with renewable generation in the smart grid: A reinforcement learning approach,” IEEE Systems Journal, vol. 13, no. 3, pp. 3283–3294, 2018.
[23] D. O’Neill, M. Levorato, A. Goldsmith, and U. Mitra, “Residential demand response using reinforcement learning,” in 2010 First IEEE international conference on smart grid communications. IEEE, 2010, pp. 409–414.
[24] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT Press, Cambridge, MA, USA, 2011.


A Cost-effective Scheduling Control for a Safety Critical Hybrid Power System

Jalil Boudjadar
Department of Engineering
Aarhus University, Denmark
jalil@eng.au.dk

Mohammad Hassan Khooban
Department of Engineering
Aarhus University, Denmark
khooban@eng.au.dk

Abstract: In this paper, we propose a safety-driven cost-effective scheduling controller to arbitrate and operate the energy resources of a maritime hybrid energy application. The proposed control algorithm enables efficient energy management by dynamically scheduling the energy sources to supply the real-time power requests so that 1) we maintain the system safety by not overloading or heating up an energy source; 2) we reduce the operation cost by considering the cheapest energy source in a real-time manner. The efficiency and safety of our scheduling algorithm have been examined using the Uppaal model checker. The experiment outputs show that our controller maintains the system safety and guarantees the lowest operation cost.

Index Terms: Safety control, real-time scheduling, hybrid-energy systems, formal verification.

978-1-7281-7343-6/20/$31.00 © 2020 IEEE

I. INTRODUCTION

Fuel cell (FC) is a green technology to generate energy from hydrogen. It is usually paired with reversible energy storage devices such as batteries and ultracapacitors to create hybrid electric power supply solutions, known in short as FC-hybrid power systems [11]. FC-based systems have very strict safety requirements due to the existence of hazardous phenomena such as the explosion of hydrogen tanks and batteries, melting of fuel cells, storage risks, etc. The complexity and criticality of such systems make the underlying real-time control complex, as it has to consider different constraints related to safety and reliability [1], [2], operation cost [24] and performance [6]. To comply with the safety requirements imposed by the international regulations and standards, FC-hybrid energy systems are subject to rigorous verification and validation, with absolute guarantees, prior to deployment [24].

The safety and energy efficiency of FC-based systems have been studied for different transport applications [6], [16], [21] such as road vehicles [3], buses [18] and trams [25]; however, only a few attempts consider both metrics (safety, operation cost) together when designing controllers for FC-hybrid applications. Making safety the main driving property in the design of safety-critical systems may lead to expensive operation cost and inefficient utilization of the energy resources. This paper introduces a safe controller for cost-effective real-time scheduling of the energy sources of a hybrid fuel cell-based power system (HFC).

The proposed real-time controller defines a compromise between safety and operation cost to supply the real-time energy requests from propulsion motors so that the operation cost is reduced as low as the safety permits. Runtime scheduling decisions are made with respect to the actual system state, such as device temperatures and available energy reserves. To deliver an absolute guarantee about the system safety, we examine our runtime scheduler using the Uppaal model checker.

II. SYSTEM ARCHITECTURE

In general, a zero-emission ferry power system with fuel cells (as a main source of the ship power), power electronic devices (as interfaces for renewable energy systems) and loads (like ship motor(s) and navigation system(s)) can be considered as a special mobile islanded DC microgrid [12]. The hybrid energy system we consider is composed of two subsystems, FC-based and battery-based, as depicted in Fig. 1. The fuel cell-based subsystem consists of three Proton-Exchange Membrane Fuel Cell units (FC for short) having different capacities (300 kWh, 300 kWh, 100 kWh). These units can operate collaboratively to inject energy into the power bus depending on the load request and the subsystem configuration such as temperature, operation cost, etc. The battery-based system is formed by two lithium batteries (200 kWh each) operating in a similar way to the FC units; however, batteries can also extract energy from the power bus for self-recharging. The two energy subsystems have different safety and performance characteristics as well as operation cost.

The control architecture of our HFC system is given as a hierarchical scheduling system [8]. An individual controller is dedicated to operating each of the two energy subsystems locally (FC-Ctrl, B-Ctrl). Moreover, a top-level controller (H-Ctrl) is considered as the main energy management system coordinating the different subsystems. Each subsystem can interact in two ways: send alarm events to the main controller and receive commands from the main controller through the command bus, or inject energy into the DC network. The battery-based system can also receive energy from the DC network.

The top-level controller H-Ctrl receives power requests from the load (propulsion engines) as signals via port R, the real-time cost for the operation of both battery and FC via port C, and the internal configuration of each subsystem such as the amount of energy left in each of the storage units and


temperature of the different energy sources. Based on the different real-time inputs, H-Ctrl calculates a scheduling decision and hands the control over to the controller of the energy subsystems identified as suitable to deliver the requested power, i.e., FC-Ctrl and/or B-Ctrl. The local controller decision depends in turn on the internal state of each energy unit and on the amount of energy to deliver. The local controllers can issue signals to the top-level controller so that a scheduling decision is recalculated.

Fig. 1. Physical setting of a marine HFC

III. RELATED WORK

The safety and energy scheduling efficiency of FC-based systems have been studied for different application areas [17], [6], [5], [16], [21] such as road vehicles [3], buses [18] and trams [25]. However, only a few attempts focused on the safety analysis of FC for marine applications [22], [20].

The authors of [5], [6] introduce a scheduler for FC-based hybrid vehicle power systems. The control algorithm is coupled with an optimization process to find the efficient injection rate that minimizes the gap between the FC supply and the load demand. However, the main driving optimization factor relies on performance only, which may jeopardize the system safety. Compared to that, our energy scheduling algorithm guarantees cost-optimal operation while maintaining the system safety.

The authors of [14], [13] propose a scheduling process for multi-stack FC systems to improve the reliability and lifetime of individual FC units. A fuel cell is chosen to supply the load energy if the cost related to its activity is the smallest possible. Making cost the decisive factor may overload the cheap resource units, thus contributing to the temperature increase and violating safety.

Bigdeli et al. [4] propose a scheduling technique for a FC hybrid electric power generation system. The collaborative scheduling, where different energy sources operate simultaneously, is achieved using performance as the main criterion. This might lead to safety violations of the most active energy supply units.

The authors of [15] studied the scheduling of energy supply for FC hybrid electric vehicles with the objective of minimizing the overall cost of hydrogen consumption. The collaborative scheduling of the energy sources was performed by considering availability, performance and state of charge constraints.

Our paper is a design and validation of an efficient real-time scheduler, as a decision making process, of the energy sources of the hybrid fuel cell-based energy (HFC) system. It focuses on finding a compromise between safety, performance and cost so that the system remains safe while the operation cost is reduced as low as the safety permits.

IV. COST-EFFECTIVE SCHEDULING CONTROL

The runtime scheduler we propose runs every time the controller H-Ctrl receives a new energy request from propulsion motors, or an alarm from one of the subsystems. The alarm signals can be:
• A FC/battery unit heating up to a threshold.
• Hydrogen tank of the FC-based system running out.
• Battery unit reaching the maximum storage level.
• Battery reaching the minimum storage level.

We use $R = \langle e, [t, t'] \rangle$ to denote a power request, where $e$ is the energy supply rate and $[t, t']$ is the time interval for which the supply holds. To satisfy a power request, the H-Ctrl controller first checks the availability of the requested energy in both subsystems. If such an energy amount is available in both resources, a safety check is performed to make sure that, if a subsystem is used to deliver the requested energy, it will not violate its safety properties such as maximum temperature and minimum reserve. If both subsystems are able to safely satisfy such a request, a cost analysis is carried out to choose the less expensive subsystem. In case neither of the subsystems is able to satisfy the power request individually, a collaborative scheduling [19] is used where both subsystems contribute with different percentages.

A. Energy Reserve Availability Check

This analysis stage checks whether the remaining energy reserve of a subsystem is sufficient to satisfy a given power request. To this end, for each demand we perform an analysis of the post-supply state that would be obtained if the demand request were satisfied [7]. Given a power request $R = \langle e, [t, t'] \rangle$, we first calculate the hydrogen volume $v(R)$ needed to produce the energy requested in $R$. If the required hydrogen volume $v(R)$ is smaller than the actual hydrogen volume $H(t)$ at the time instant $t$ when $R$ is issued, then the FC-based subsystem is a candidate to satisfy $R$, i.e.

\[ \mathit{Candidate}(\mathit{FC\_Ctrl}, R) \doteq H(t) > v(R). \]

Similarly, the function $\mathit{Candidate}()$ for the battery-based subsystem is calculated as follows:

\[ \mathit{Candidate}(\mathit{B\_Ctrl}, R) \doteq \sum_{j} \mathit{Reserve}(B_j, t) > e \cdot (t' - t) \]

where $\mathit{Reserve}(B_j, t)$ is the energy level available in battery unit $B_j$ at time point $t$. This in fact corresponds to the state of charge (SoC). The local controller of the battery subsystem


checks whether there is a battery unit that is able to provide the totality of the requested amount by comparing the state of charge with the draining that the request would cause. If neither of the two battery units is able to provide the requested energy individually, we consider collaborative scheduling [10], [19] where both battery units together supply two sub-requests, respectively $R_1$ and $R_2$, such that $e = e_1 + e_2$.

B. Safety Check

This process analyzes whether a candidate energy unit would violate its maximum temperature in case it is used to satisfy an energy request. To this end, we calculate what the temperature would be after satisfying a request. If the maximum temperature of a candidate energy source unit is not violated, then such a device proceeds to the cost analysis; otherwise the energy source unit is discarded for supplying the totality of the requested energy and a collaborative scheduling has to be considered. Formally, we define a function $\mathit{Safe}(F_i, R)$ to check the safety of a FC unit as follows:

\[ \mathit{Safe}(F_i, R) \doteq \mathit{Temp}(F_i, t') < \mathit{Max\_Temp}_i. \]

Accordingly, the fuel cell based subsystem is safe if and only if:

\[ \mathit{Safe}(\mathit{FC\_Ctrl}, R) \doteq \exists i \mid \mathit{Safe}(F_i, R) \ \text{Or} \ \forall i \ \mathit{Safe}(F_i, R_i) \]

such that $R = \sum_i R_i$. Similarly, we define the temperature safety property of a battery unit regarding a power request as follows:

\[ \mathit{Safe}(B_j, R) \doteq \mathit{Temp}(B_j, t') < \mathit{Max\_Temp}_j. \]

In case a battery unit is not able to safely satisfy a power request $R$, we consider a collaborative supply where both battery units contribute. We accordingly define the safety of the battery-based subsystem as follows:

\[ \mathit{Safe}(\mathit{B\_Ctrl}, R) \doteq \exists j \mid \mathit{Safe}(B_j, R) \ \text{Or} \ \mathit{Safe}(B_1, R_1) \wedge \mathit{Safe}(B_2, R_2) \]

where $R = R_1 + R_2$.

The contribution percentage of the two battery units, when operating under collaborative mode, is calculated as stated in Algorithm 1. The identified percentages must satisfy both the safety and the availability check. In case both subsystems succeed in the safety check, we consider the related operation cost.

C. Cost Calculation

The cost analysis step amounts to calculating the operation cost of each energy subsystem when satisfying a given energy request, using real-time energy rates. The operation cost is simply obtained by multiplying the total resource amount to be supplied by the respective energy unit price. We define the cost for the FC-Ctrl system to satisfy a request $R = \langle e, [t, t'] \rangle$ as the volume of hydrogen to consume multiplied by the unit price of hydrogen $U(H_2, t)$:

\[ \mathit{Cost}(\mathit{FC\_Ctrl}, R) = v(R) \cdot U(H_2, t). \]

In a similar way, we calculate the budget of using the battery-based system as the amount of energy to inject multiplied by the actual unit price of charging $U(E, t)$:

\[ \mathit{Cost}(\mathit{B\_Ctrl}, R) = \mathit{drain}(B_1, R) \cdot U(E, t) \]

Since our battery units are identical, we use the unit price of battery $B_1$ no matter of which battery unit is actually being

D. Collaborative Scheduling

As mentioned earlier, if none of the energy subsystems successfully passes the availability and safety checks to serve a power request individually, we consider a collaborative scheduling mode where both energy subsystems contribute with different rates. The individual contribution of each subsystem needs to be approved for availability and safety.

Collaborative scheduling [10] is achieved by a binary process carried out in an iterative way following the dichotomy method [23]. For a given power request $R$, the starting point of the iterative process consists in checking whether each subsystem is able to supply $R/2$. By ability we mean that a subsystem has enough reserve to serve a given request safely. In case one of the subsystems fails the ability check, we reduce its contribution by half, i.e. to $R/4$, and so on until we find a contribution rate $x$ that the given subsystem is able to provide while the other subsystem is able to provide $R - x$ safely. Otherwise, when both subsystems are each able to provide $R/2$, the iterative process considers increasing the contribution of the resource having the cheaper cost. We keep increasing the contribution $x$ of the cheap subsystem on each iteration, by 50% of the value added on the last iteration, until it is no longer able to secure the contribution rate $x$. The final value of $x$ is most likely the contribution rate leading to the optimal operation cost. An abstraction of the algorithm is depicted in Algorithm 1.

E. Safety and Operation Cost Analysis

To perform a formal verification of the safety properties, we mechanize the system model in Uppaal. In fact, we use symbolic model checking to examine safety properties whereas statistical model checking (SMC) is used for quantitative analysis [9]. An example of a safety property is

\[ S_1 \doteq \forall i \ \forall t \ \ \mathit{Temp}(F_i, t) \leq \mathit{Max\_Temp}_i. \]

In fact, each safety property is examined using Uppaal as follows: $\forall[]\, S_i$. In a similar way, the following Uppaal SMC queries are used respectively to perform quantitative analysis of the batteries' SoC and of the operation cost:

E[time≤1e6; 1000] (max:SoC_j); E[time≤1e6; 1000] (max:cost).

Each query specifies that SMC runs 1000 simulations, each of which lasts for one million time units.

V. CONCLUSION

This paper presented a safety-driven cost-effective controller to schedule the energy sources of a hybrid energy solution. The solution considered is formed by a set of batteries and fuel cell units to power a maritime application. The proposed algorithm operates the energy supply units following the real-time energy demand, safety constraints and operation cost.

To deliver an absolute guarantee about the system safety, we examine our runtime scheduler using the Uppaal model checker.
selected by B Ctrl. In case both energy subsystems have the As a future work, we plan to investigate optimal decision
same operation cost to satisfy a power request, we prioritize to making strategies that are able to follow the HFC system
use the FC-based subsystem due to its fast tanking operation. dynamics in real-time.
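As an illustration of the cost calculation step (Section C) combined with the safety outcome, the following Python sketch selects a subsystem for one request. It is a minimal sketch under stated assumptions: the names `Quote` and `choose_subsystem` and the returned labels are hypothetical, the safety flags are assumed to have been computed elsewhere (e.g., by the Safe predicates), and the availability (candidate) check is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quote:
    """Hypothetical container for one subsystem's offer on a request R."""
    amount: float      # v(R) for the fuel cells, drain(B1, R) for the batteries
    unit_price: float  # U(H2, t) or U(E, t)
    safe: bool         # outcome of the corresponding Safe(...) check

    @property
    def cost(self) -> float:
        # Operation cost = amount to supply * unit price.
        return self.amount * self.unit_price


def choose_subsystem(fc: Quote, battery: Quote) -> str:
    """Return which subsystem should serve the request."""
    if fc.safe and battery.safe:
        # On equal cost the FC-based subsystem is preferred (<=).
        return "FC" if fc.cost <= battery.cost else "BATTERY"
    if fc.safe:
        return "FC"
    if battery.safe:
        return "BATTERY"
    # Neither subsystem can safely serve R alone.
    return "COLLABORATIVE"
```

The `<=` comparison encodes the tie-breaking rule stated in the text: when both costs are equal, the fuel cell subsystem is operated.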

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Algorithm 1 Sketch of the Scheduling Algorithm
foreach NewEvent X do
    Q⟨e_q, [t, t_q]⟩ = Left_Last_Request();
    if X = R(⟨e, [t, t′]⟩) then
        Scheduler(R);
    end
    if X = Max_Temp(F_i) then
        Cooling(F_i);
        Scheduler(⟨e_q + e_c, [t, t_q]⟩)
    end
    if X = Max_Temp(B_j) or X = Min_H_Volume() then
        Scheduler(Q);
    end
    if X = Min_SoC(B_j) then
        Charge(B_j, ⟨e_min, [t, t_q]⟩);
        Scheduler(⟨e_q + e_min, [t, t_q]⟩);
    end
    if X = Max_SoC(B_j) then
        Scheduler(⟨e_q − e_charge, [t, t_q]⟩);
    end
end
Scheduler(Z)
if Candidate(FC_Ctrl, Z) and Candidate(B_Ctrl, Z) then
    if Safe(FC_Ctrl, Z) and Safe(B_Ctrl, Z) then
        if Cost(FC_Ctrl, Z) ≤ Cost(B_Ctrl, Z) then
            Operate(FC_Ctrl);
            Return(True);
        end
        else
            Operate(B_Ctrl);
            Return(True);
        end
    end
    else
        Case Safe(FC_Ctrl, Z): Operate(FC_Ctrl);
        Case Safe(B_Ctrl, Z): Operate(B_Ctrl);
        Default: CollaborativeSched(Z);
    end
end
else
    Case Candidate(FC_Ctrl, Z) and Safe(FC_Ctrl, Z): Operate(FC_Ctrl);
    Case Candidate(B_Ctrl, Z) and Safe(B_Ctrl, Z): Operate(B_Ctrl);
    Default: CollaborativeSched(Z);
end
CollaborativeSched(Z)
while False do
    x = Z; if Sched(x/2) ∧ Sched(x/2) then
        True;
    end
    else
        Sched(x + x/2) ∧ Sched(x − x/2);
        x ↦ x/2;
    end
end

REFERENCES
[1] P. Aguiar, C. Adjiman, and N. Brandon. Anode-supported intermediate temperature direct internal reforming solid oxide fuel cell. i: model-based steady-state performance. Journal of Power Sources, 138(1):120–136, 2004.
[2] S. Alkaner and P. Zhou. A comparative study on life cycle analysis of molten carbon fuel cells and diesel engines for marine application. Journal of Power Sources, 158(1):188–199, 2006.
[3] W. Andari, S. Ghozzi, H. Allagui, and A. Mami. Optimization of hydrogen consumption for fuel cell hybrid vehicle. Indian Journal of Science and Technology, 11(2), 2018.
[4] N. Bigdeli. Optimal management of hybrid pv/fuel cell/battery power system: A comparison of optimal hybrid approaches. Renewable and Sustainable Energy Reviews, 42:377–393, 2015.
[5] N. Bizon. Energy optimization of fuel cell system by using global extremum seeking algorithm. Applied Energy, 206:458–474, 2017.
[6] N. Bizon. Real-time optimization strategy for fuel cell hybrid power sources with load-following control of the fuel or air flow. Energy Conversion and Management, 157:13–27, 2018.
[7] J.-P. Bodeveix, A. Boudjadar, and M. Filali. An alternative definition for timed automata composition. In Automated Technology for Verification and Analysis, pages 105–119, Berlin, Heidelberg, 2011. Springer Berlin Heidelberg.
[8] A. Boudjadar, A. David, J. H. Kim, K. G. Larsen, M. Mikucionis, U. Nyman, and A. Skou. Hierarchical scheduling framework based on compositional analysis using uppaal. In 10th International Symposium on Formal Aspects of Component Software - Volume 8348, FACS 2013, pages 61–78. Springer-Verlag, 2013.
[9] J. Boudjadar, A. David, J. Kim, K. Larsen, U. Nyman, and A. Skou. Schedulability and energy efficiency for multi-core hierarchical scheduling systems. In European Congress on Embedded Real Time Systems, 2014.
[10] J. Boudjadar, S. Ramanathan, A. Easwaran, and U. Nyman. Combining task-level and system-level scheduling modes for mixed criticality systems. In 23rd IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications, DS-RT 2019, 2019.
[11] V. Das, S. Padmanaban, K. Venkitusamy, R. Selvamuthukumaran, F. Blaabjerg, and P. Siano. Recent advances and challenges of fuel cell based power system architectures and control: a review. Renewable and Sustainable Energy Reviews, 73:10–18, 2017.
[12] M. Gheisarnejad, H. Mohammadi-Moghadam, J. Boudjadar, and M. H. Khooban. Active power sharing and frequency recovery control in an islanded microgrid with nonlinear load and nondispatchable dg. IEEE Systems Journal, 14(1):1058–1068, 2020.
[13] N. Herr, J. Nicod, and C. Varnier. Prognostics-based scheduling in a distributed platform: Model, complexity and resolution. In 2014 IEEE International Conference on Automation Science and Engineering (CASE), pages 1054–1059, 2014.
[14] N. Herr, J.-M. Nicod, C. Varnier, L. Jardin, A. Sorrentino, D. Hissel, and M.-C. Pra. Decision process to manage useful life of multi-stacks fuel cell systems under service constraint. Renewable Energy, 105:590–600, 2017.
[15] A. Neffati, M. Guemri, S. Caux, and M. Fadel. Energy management strategies for multi source systems. Electric Power Systems Research, 102:42–49, 2013.
[16] F. Odeim, J. Roes, and A. Heinzel. Power management optimization of an experimental fuel cell/battery/supercapacitor hybrid system. Energies, 8(7):6302–6327, 2015.
[17] L. Olatomiwa, S. Mekhilef, M. Ismail, and M. Moghavvemi. Energy management strategies in hybrid renewable energy systems: A review. Renewable and Sustainable Energy Reviews, 62:821–835, 2016.
[18] J. Peng, H. He, and R. Xiong. Rule based energy management strategy for a series-parallel plug-in hybrid electric bus optimized by dynamic programming. Applied Energy, 185:1633–1643, 2017. Clean, Efficient and Affordable Energy for a Sustainable Future.
[19] A. Toor, S. ul Islam, N. Sohail, A. Akhunzada, J. Boudjadar, H. A. Khattak, I. U. Din, and J. J. Rodrigues. Energy and performance aware fog computing: A case of dvfs and green renewable energy. Future Generation Computer Systems, 101:1112–1121, 2019.
[20] T. Tronstad, H. H. Astrand, G. P. Haugom, and L. Langfeldt. Study on the use of fuel cells in shipping. EMSA European Maritime Safety Agency, pages 1–108, 2017.
[21] N. Vafamand, M. H. Khooban, T. Dragičević, J. Boudjadar, and M. H. Asemani. Time-delayed stabilizing secondary load frequency control of shipboard microgrids. IEEE Systems Journal, 13(3):3233–3241, 2019.
[22] L. van Biert, M. Godjevac, K. Visser, and P. Aravind. A review of fuel cell systems for maritime applications. Journal of Power Sources, 327:345–364, 2016.
[23] T. Villa, T. Kam, R. K. Brayton, and A. L. Sangiovanni-Vincentelli. Synthesis of Finite State Machines: Logic Optimization. 2011.
[24] J. Wang. Barriers of scaling-up fuel cells: Cost, durability and reliability. Energy, 80:509–521, 2015.
[25] W. Zhang, J. Li, L. Xu, and M. Ouyang. Optimization for a fuel cell/battery/capacity tram with equivalent consumption minimization strategy. Energy Conversion and Management, 134:59–69, 2017.


CoSim: A Simulator for Co-Scheduling of Batch and On-Demand Jobs in HPC Datacenters

Avinash Maurya∗, Bogdan Nicolae†, Ishan Guliani∗, M. Mustafa Rafique∗
∗Rochester Institute of Technology, USA
†Argonne National Laboratory, USA
Email: ∗ {am6429, ig5859, mrafique}@cs.rit.edu; † bnicolae@anl.gov

Abstract—The increasing scale and complexity of scientific applications are rapidly transforming the ecosystem of tools, methods, and workflows adopted by the high-performance computing (HPC) community. Big data analytics and deep learning are gaining traction as essential components in this ecosystem in a variety of scenarios, such as steering of experimental instruments, acceleration of high-fidelity simulations through surrogate computations, and guided ensemble searches. In this context, the batch job model traditionally adopted by the supercomputing infrastructures needs to be complemented with support to schedule opportunistic on-demand analytics jobs, leading to the problem of efficient preemption of batch jobs with minimum loss of progress. In this paper, we design and implement a simulator, CoSim, that enables on-the-fly analysis of the trade-offs arising between delaying the start of opportunistic on-demand jobs, which leads to longer analytics latency, and loss of progress due to preemption of batch jobs, which is necessary to make room for on-demand jobs. To this end, we propose an algorithm based on dynamic programming with predictable performance and scalability that enables supercomputing infrastructure schedulers to analyze the aforementioned trade-off and take decisions in near real-time. Compared with other state-of-the-art approaches using traces of the Theta pre-Exascale machine, our approach is capable of finding the optimal solution, while achieving high performance and scalability.

Index Terms—High-performance computing, batch job preemption, job checkpointing

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Big data analytics and deep learning are rapidly gaining traction both in industry and in scientific computing. A key driver for this trend has been the unprecedented accumulation of big data, which exposes plentiful learning opportunities thanks to its massive size and variety. Unsurprisingly, there has been significant interest in adopting deep learning at a very large scale on supercomputing infrastructures in a wide range of scientific areas, e.g., fusion energy science [1], computational fluid dynamics [2], lattice quantum chromodynamics [3], virtual drug response prediction [4], and cancer research [5].

One of the main use cases of big data analytics and deep learning in scientific computing is to use them as a tool to complement high-performance computing (HPC) simulations running on supercomputing infrastructures in a variety of scenarios: steering of experimental instruments (e.g., calibrate scientific instruments in real-time to correct anomalies in experimental data and/or refocus dynamically on areas of interest), acceleration of high-fidelity simulations through surrogate computations (e.g., under the right circumstances, expensive steps of an HPC simulation can be replaced with faster deep learning predictions), and guided ensemble searches (e.g., when running a set of simulations to find a molecule that docks to a protein, deep learning can be used to predict the most promising simulations to try next).

These scenarios require running opportunistic on-demand jobs when certain conditions are triggered, e.g., an analytics job that looks for anomalies in the experimental data collected by the instrument, or a deep learning training and/or inference job. These jobs need to start within a given deadline, often in the order of minutes. Failure to start them by the given deadline leads to a missed opportunity (e.g., it is too late to calibrate the instrument) and/or incurs a performance penalty (e.g., idle simulations that wait for the next deep learning prediction or otherwise take alternative suboptimal decisions). On the other hand, HPC datacenters traditionally adopt a batch job scheduling model where users request compute and accelerator (e.g., GPU) resources of the datacenters for the required amount of time (wall time), while the scheduler decides when to run each job based on various trade-offs such as the need to maximize the utilization of machines, and job priority. Popular HPC datacenter schedulers, e.g., SLURM (Simple Linux Utility for Resource Management) [6], COBALT [7], and TORQUE (Terascale Open-source Resource and QUEue manager) [8], cannot co-schedule batch jobs with opportunistic on-demand jobs.

A naive solution to address this problem could simply reserve a set of nodes for on-demand jobs and use the rest of the nodes for batch jobs. Although applied in practice, such a solution is not desirable, as it is hard to predict how many nodes are needed by the on-demand jobs. Using too few nodes for on-demand jobs leads to missed opportunities, whereas using too many nodes leads to idle nodes and slow progress of the batch jobs. Furthermore, even if such predictions were perfect, there may be significant fluctuations in datacenter utilization patterns that make it hard to dynamically move the nodes back and forth between on-demand and batch queues. At the other extreme, an alternative naive solution could simply use all nodes for batch jobs and start killing batch jobs to make room for on-demand jobs when needed. This solution does not lead to missed opportunities but may incur significant overhead on the batch jobs due to loss of progress.

In this paper, we propose an alternative solution to address these challenges that relies on checkpointing for suspending and resuming batch jobs, if required, to make room for time-


sensitive on-demand jobs, thereby minimizing the amount of lost progress by the batch jobs. To this end, we introduce CoSim, a simulation framework that aims to identify the optimal combination of jobs that should be either checkpointed or killed to free a fixed number of nodes that are required to run the on-demand job. Unlike other approaches, CoSim simulates all outcomes resulting from a variable deadline up to the given maximum in a single pass, thereby eliminating the need to run separate simulations for each fixed deadline. Using this approach, the scheduler can make more informed decisions by considering the various trade-offs arising from delaying the start of the on-demand jobs and losing progress on the batch jobs. Specifically, we make the following contributions in this paper:
• We formulate the problem statement, introducing a series of assumptions and general considerations for simulating the outcomes of checkpointing batch jobs to vacate nodes for running on-demand jobs (Section II).
• We introduce a series of design principles and an algorithm based on dynamic programming to find the optimal combination of batch jobs that incurs the minimum loss of progress while satisfying the deadline of the on-demand jobs. Our algorithm produces an optimal solution for every possible deadline up to a given maximum in a single pass (Section III).
• We evaluate our approach in a series of experiments using three scenarios extracted from the batch job traces of Argonne's Theta pre-Exascale machine. We compare our approach with an exhaustive search based on backtracking and a greedy approach. The results show significant performance and scalability improvement as compared to backtracking, as well as a significant improvement in the quality of the solution as compared to the greedy approach (Section IV).

II. PROBLEM FORMULATION

The problem of co-scheduling batch jobs with opportunistic on-demand jobs in an HPC datacenter can be formulated as follows. Let us assume that $N$ batch jobs are running at time $t_0$, and each of these jobs is characterized by the tuple $\langle \mathit{id}, \mathit{jn}, \mathit{loss}, \mathit{tckpt} \rangle$, where id is a unique identifier of the job, jn is the number of compute nodes the batch job is running on, loss quantifies the amount of lost progress if the job is killed (e.g., node-hours since the last checkpoint or since the beginning if no checkpoint was taken), and tckpt is the time required to checkpoint the job to successfully suspend its execution without loss of progress.

Given $K$ nodes that need to be released not later than $t_0 + T$ (for the purpose of starting opportunistic on-demand jobs), the goal is to find all optimal subsets of batch jobs $S_i \subset N$ and the corresponding killing or checkpointing strategy for all $t_0 < i < t_0 + T$. A subset $S_i$ is optimal if it satisfies the following properties simultaneously: (1) at least $K$ nodes are released by the deadline $t_0 + i$; (2) the accumulated loss of work due to killing jobs is minimized; (3) if there are multiple subsets $S_i$ for which the accumulated loss due to job kills is minimized, then prefer the subset for which the checkpointing overhead is minimized. We refer to this as the eviction problem.

Using the subsets $S_i$ and the corresponding strategy, an HPC datacenter scheduler can simulate the outcome of multiple hypothetical scenarios with a variable deadline $i$ corresponding to the trade-off between maximizing the value of the on-demand jobs and minimizing the loss of the batch jobs.

We note that while we formulate the problem for HPC datacenters, a similar formulation can be done for opportunistic jobs in cloud computing architectures where there is an upper bound on elasticity, e.g., the user cannot afford to run on more than a fixed number of virtual machines (VMs) at a time and must evict existing jobs if necessary. Without loss of generality, CoSim can be applied in such scenarios as well.

III. DESIGN PRINCIPLES AND APPROACH

This section introduces the high-level design principles of our proposed approach and explains aspects related to the checkpointing model and exploration algorithm that implement these design principles.

A. Design principles

CoSim is based on the following design principles:

1) Mix of system-level and application-level checkpointing: We differentiate between system-level and application-level checkpointing, because they present an interesting trade-off: system-level checkpointing techniques, such as DMTCP [9], are application-agnostic and can be performed at any moment $t_0$. However, they involve large checkpoint sizes because the entire memory space of all application processes needs to be persisted to stable storage, e.g., a parallel file system (PFS). Therefore, system-level checkpointing may take a long time to complete. On the other hand, application-level checkpointing is typically performed by HPC applications regularly using either a custom solution or a checkpointing library, such as VELOC [10]. In this case, the checkpoint size is smaller, as each application process needs to save only the critical data structures needed for a restart, and is therefore faster to write to the stable storage. However, it is necessary to wait for the application to reach a moment $t_1 > t_0$ when it is safe to checkpoint. Depending on how far away $t_1$ is from $t_0$ and how much larger a system-level checkpoint is compared with an application-level checkpoint, one or the other may be faster. Furthermore, it is important to note that even if the system-level and application-level checkpointing overheads are equal, it is still important to choose the application-level checkpoint over the system-level checkpoint, because using the application-level checkpoint enables the batch job to make additional progress during the interval $(t_0, t_1)$. We incorporate such considerations in our simulator.

2) Simultaneous exploration of the full on-demand deadline range: As discussed in Section II, our goal is to solve the eviction problem for all deadlines in the range $(t_0, t_0 + T)$, because the scheduler needs to consider the trade-off between delaying the on-demand jobs, which may lead to lower quality of the results due to slow reaction time, and losing progress

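To make the eviction problem of Section II concrete, the sketch below enumerates every combination of actions for a handful of jobs and keeps the one that satisfies properties (1) to (3). It is only a brute-force reference with exponential cost, not the algorithm proposed in this paper. The names `BatchJob` and `evict_brute_force` are illustrative, the deadline is expressed relative to $t_0$, and the sketch assumes that checkpoints are taken one after another so that their durations add up.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BatchJob:
    id: str
    jn: int        # number of compute nodes occupied
    loss: float    # progress lost if the job is killed
    tckpt: float   # time needed to checkpoint the job


def evict_brute_force(jobs, k, deadline):
    """Return ((loss, checkpoint_time), {job id: action}) or None if infeasible."""
    best = None
    for actions in product(("ignore", "kill", "ckpt"), repeat=len(jobs)):
        pairs = list(zip(jobs, actions))
        freed = sum(j.jn for j, a in pairs if a != "ignore")
        ckpt_time = sum(j.tckpt for j, a in pairs if a == "ckpt")
        # Property (1): at least k nodes released by the deadline.
        if freed < k or ckpt_time > deadline:
            continue
        loss = sum(j.loss for j, a in pairs if a == "kill")
        # Properties (2) and (3): minimal loss first, then minimal overhead.
        key = (loss, ckpt_time)
        if best is None or key < best[0]:
            best = (key, {j.id: a for j, a in pairs if a != "ignore"})
    return best
```

With a generous deadline the cheapest checkpoint wins; as the deadline shrinks, the search falls back to killing the job with the smallest loss.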

of the batch jobs. Thus, a naive strategy would be to iterate over all deadlines $i$ in the range $(t_0, t_0 + T)$ and solve the problem independently for each $i$. However, such a strategy is sub-optimal, because the problems resulting from fixing all deadlines $i$ in the range $(t_0, t_0 + T)$ have identical inputs except for the deadline $i$; therefore, they may be decomposed into sub-problems that are shared across several instances and thus need to be solved only once. Our approach leverages this observation to construct an algorithm based on dynamic programming that is capable of taking advantage of such decompositions to solve all deadlines in a single pass. This algorithm is explained in Section III-C.

3) Polynomial response time: A key requirement for exploring the full on-demand deadline range is to ensure fast response time so that the scheduler can decide quickly, preferably at moment $t_0$, about the jobs that must be checkpointed to the PFS to run the incoming on-demand jobs. Thus, an algorithm that is not polynomial in all variables, i.e., the number of jobs $N$, the maximum deadline $T$, and the number of nodes to be released $K$, will lead to unacceptable response time, considering that modern HPC datacenters routinely run several batch jobs simultaneously and may need to release a large number of nodes for on-demand jobs to accommodate bursts of opportunistic events. Therefore, our proposed solution is designed to satisfy such constraints, and delivers response times in the order of milliseconds or less.

B. Loss and checkpointing model

We estimate the loss incurred by killing a batch job as the number of node-hours that have elapsed since its last application-level checkpoint until $t_0$, the moment when the nodes need to be evicted to make room for the on-demand jobs. This is based on a configurable checkpointing interval that can be independently adjusted for each job in our simulator. In practice, the interval is fixed based on empirical observations, e.g., every hour, because the checkpoints are used both to survive failures and to record intermediate results. However, if checkpoints are only used for fault tolerance, then an optimal checkpointing interval can be computed [11].

In order to simulate alternatives that mix application-level checkpointing with system-level checkpointing, we consider the time to checkpoint each batch job:

$$\mathit{tckpt} = \max\left( \frac{\sum_{i=1}^{\mathit{jn}} \mathit{sckpt}(i)}{B_a},\ \max_{i=1}^{\mathit{jn}} \frac{\mathit{sckpt}(i)}{B_c} \right) \qquad (1)$$

where jn is the number of nodes occupied by the batch job, $B_a$ is the aggregated I/O bandwidth of the PFS, $B_c$ is the maximum I/O bandwidth of a compute node, and $\mathit{sckpt}(i)$ is the size of the checkpoint on node $i \in [1 \ldots \mathit{jn}]$. The intuition behind this is that the checkpointing time is bounded either by the maximum aggregated bandwidth of the PFS or by the slowest node (if the nodes do not consume the maximum aggregated bandwidth).

In the case of application-level checkpointing, we must wait for the next checkpoint to happen, which introduces a delay in addition to tckpt. Therefore, the application-level checkpointing duration is $\mathit{ta} = \mathit{tckpt} + \mathit{delay}$, where delay is the difference between the next scheduled checkpoint and $t_0$. Since system-level checkpointing can be performed instantly at $t_0$, its duration, ts, will be equal to tckpt, i.e., $\mathit{ts} = \mathit{tckpt}$. However, the two approaches will have a different checkpoint size on each node, resulting in the trade-off that is explained in Section III-A.

In a typical HPC datacenter, the size of each batch job is usually large enough to saturate the aggregated I/O bandwidth of the PFS. Therefore, we consider a simple checkpointing model where the jobs are checkpointed serially. In this case, the total time required for checkpointing a set of batch jobs is the sum of their corresponding ta or ts. In fact, under such circumstances, checkpointing multiple batch jobs in parallel would perform worse than checkpointing the batch jobs serially, because of over-subscribing the aggregated I/O bandwidth of the PFS. Nevertheless, we note that our model can be further refined to simulate concurrent checkpointing of the batch jobs in the case of small jobs that do not saturate the aggregated I/O bandwidth of the PFS.

C. Exploration algorithm

In this section we propose a dynamic programming algorithm based on the aforementioned design principles.

The key observation that inspires our algorithm is the fact that the eviction problem is related to the discrete backpack (0/1 knapsack) problem: given $N$ items, where $W_i$ and $V_i$ represent the weight and value of the $i$th item, fill a backpack that can carry a maximum weight $K$ such that the combined value of all items is maximized without overflowing $K$. By analogy, we can consider the batch jobs as items and the backpack as the set of nodes where the jobs are running. This problem has a simple dynamic programming decomposition: the maximum value for $N$ items is the greater of: (1) the maximum value for $N - 1$ items and capacity $K$ (excludes item $N$); (2) $V_N$ plus the maximum value obtained for $N - 1$ items and capacity $K - W_N$ (includes item $N$). By solving this decomposition recursively and applying memoization techniques, a runtime complexity of $O(N \cdot K)$ can be achieved. Note that this decomposition solves the problem not only for a backpack of capacity $K$, but at the same time for all backpacks of capacity $i$ such that $0 < i \le K$.

Starting from this observation, we adopt a similar strategy but with two important differences. First, we need to free at least $K$ nodes, which means that the optimal solution may involve more than $K$ nodes. Therefore, we need to consider up to $M$ nodes, where $M$ is the number of nodes occupied by all batch jobs at the moment $t_0$. Second, the eviction problem introduces a new dimension in the decomposition, i.e., the deadline $T$ to start the on-demand jobs. Specifically, it is not enough to release at least $K$ nodes within a deadline $T$ when considering $N - 1$ batch jobs and then try for the $N$th batch job all four alternatives, i.e., ignore, kill, application-level checkpoint, and system-level checkpoint, because the optimal solution for $N - 1$ batch jobs may get close to the deadline
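The checkpointing model of Section III-B, i.e., Eq. (1) together with the durations ta and ts and the serial checkpointing assumption, can be sketched in a few lines of Python. The function names and the choice of units (sizes in GB, bandwidths in GB/s, times in seconds) are assumptions made for this illustration only.

```python
def checkpoint_time(sckpt, b_a, b_c):
    """Eq. (1): time to checkpoint one batch job.

    sckpt: per-node checkpoint sizes (one entry per node of the job)
    b_a:   aggregated I/O bandwidth of the PFS
    b_c:   maximum I/O bandwidth of a single compute node
    """
    # Bounded either by the shared PFS or by the slowest (largest) node.
    return max(sum(sckpt) / b_a, max(size / b_c for size in sckpt))


def app_level_duration(tckpt, next_checkpoint, t0):
    """ta = tckpt + delay, where delay = next scheduled checkpoint - t0."""
    return tckpt + (next_checkpoint - t0)


def sys_level_duration(tckpt):
    """ts = tckpt, because a system-level checkpoint can start at t0."""
    return tckpt


def serial_total(durations):
    """Serial model: total time is the sum of the individual ta or ts."""
    return sum(durations)
```

For instance, three nodes holding 10, 10 and 40 GB with `b_a = 20` and `b_c = 5` are limited by the slowest node (8 s), whereas with `b_c = 50` the aggregated PFS bandwidth becomes the bound (3 s).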

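As an illustration of the backpack decomposition recalled in Section III-C, the following Python sketch computes the best value for every capacity from 0 to K in one pass. It is a bottom-up variant that uses a single array rather than the recursive formulation with memoization mentioned in the text; the function name is hypothetical and integer weights are assumed.

```python
def knapsack_all_capacities(weights, values, capacity):
    """Return best[c] = maximum value achievable with total weight <= c."""
    best = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # Iterate capacities downwards so that each item is used at most once.
        for c in range(capacity, w - 1, -1):
            # Either exclude the item (best[c]) or include it (best[c - w] + v).
            best[c] = max(best[c], best[c - w] + v)
    return best
```

The two nested loops give the $O(N \cdot K)$ cost, and the returned list holds the answer for all capacities at once, which is the property the eviction algorithm reuses for deadlines.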

$T$, thereby limiting the set of valid choices for job $N$ (e.g., no further checkpointing is possible within $T$).

As a consequence, we propose a two-dimensional decomposition based on both the number of nodes and the deadline. We denote with the tuple $\langle \mathit{jn}_N, \mathit{loss}_N, \mathit{ts}_N, \mathit{ta}_N \rangle$ the number of nodes, loss of progress due to job killing, system-level checkpointing duration, and application-level checkpointing duration for job $N$. Then, the minimum loss for $N$ jobs, $M$ nodes, and deadline $T$, denoted as $a[N, M, T]$, is the lesser of: (1) ignore job $N$, i.e., $a[N - 1, M, T]$; (2) kill job $N$, i.e., $\mathit{loss}_N + a[N - 1, M - \mathit{jn}_N, T]$; (3) take an application-level checkpoint of job $N$, i.e., $a[N - 1, M - \mathit{jn}_N, T - \mathit{ta}_N]$; and (4) take a system-level checkpoint of job $N$, i.e., $a[N - 1, M - \mathit{jn}_N, T - \mathit{ts}_N]$. Algorithm 1 presents our approach to solve this decomposition with a runtime of $O(N \cdot M \cdot T)$. The output of this algorithm is a list $S_i$ for $0 < i < T$, where $S_i$ is the set of jobs to be evicted using an optimal strategy such that the compute loss is minimized, and, in case of multiple solutions with minimal compute loss, the checkpointing overhead is minimized too.

Algorithm 1: Dynamic programming algorithm to free K nodes within a range of deadlines [0 . . . T] with minimal loss of compute progress.
Input: List J of N batch jobs running at t0, K, T
Output: List of job eviction strategies S_i, 0 < i < T
a[0, 0] ← 0
u[0, 0] ← ∅
M ← 0
for (id, jn, loss, ts, ta) ∈ J do
    M ← M + jn
for (id, jn, loss, ts, ta) ∈ J do
    b ← a
    v ← u
    for (n, t) ∈ a do
        if b[n + jn, t] > a[n, t] + loss then
            b[n + jn, t] ← a[n, t] + loss
            v[n + jn, t] ← u[n, t] ∪ {(id, "kill")}
        if t + ta ≤ T ∧ b[n + jn, t + ta] > a[n, t] then
            b[n + jn, t + ta] ← a[n, t]
            v[n + jn, t + ta] ← u[n, t] ∪ {(id, "app")}
        if t + ts ≤ T ∧ b[n + jn, t + ts] > a[n, t] then
            b[n + jn, t + ts] ← a[n, t]
            v[n + jn, t + ts] ← u[n, t] ∪ {(id, "sys")}
    a ← b
    u ← v
for i ∈ [0 . . . T] do
    (x, y) ← argmin(a[x = K . . . M, y = 0 . . . i])
    Result[i] ← (a[x, y], u[x, y])
return Result

We note that Algorithm 1 uses a temporary minimum loss matrix $b$ and a corresponding solution matrix $v$ to hold the updates resulting from considering all alternatives for job id. This is needed in order to avoid repeatedly selecting the same id in subsequent decompositions. Furthermore, the application-level checkpointing strategy takes precedence over the system-level checkpointing during the updates, and thus becomes the preferred choice in the case of equal loss and checkpointing time.

Another important observation is that Algorithm 1 solves the eviction problem not only for at least $K$ nodes, but for the entire node spectrum $[0 \ldots M]$. This enables the scheduler to consider more advanced trade-offs for on-demand jobs, such as running the on-demand jobs with more or fewer than the $K$ requested nodes, which can be used to dynamically adjust the latency and/or the quality of the on-demand results. Such trade-offs can be incorporated at no additional simulation cost using our proposed approach.

IV. PERFORMANCE EVALUATION

To evaluate our proposal, we study the traces of Argonne's Theta pre-Exascale machine and extract three representative scenarios that create a challenging situation with respect to the eviction problem: most of the nodes are occupied by a relatively large number of batch jobs, leading to many possible combinations that need to be explored. For each scenario, we augment the traces with additional data that enables us to apply our model in order to extract the parameters of each batch job: compute loss and application-level/system-level checkpointing duration. We then compare our dynamic programming algorithm with two other approaches: a greedy algorithm (linear complexity) and a backtracking algorithm that performs an exhaustive search (exponential complexity). In the rest of this section, we introduce the methodology of our evaluation and discuss the results of the comparison.

A. Batch job traces

In this paper, we consider the case of Argonne's Theta supercomputer, an 11.69 petaflops pre-Exascale Cray XC40 system based on the second-generation KNL Intel Xeon Phi 7230 SKU. The system has 4392 nodes, each equipped with a 64-core processor (256 hardware threads), 16 GB of high-bandwidth MCDRAM (300-450 GB/s), 192 GB of main memory (DDR4 RAM, 20 GB/s), and a 128 GB SSD (700 MB/s). The interconnect topology is based on Dragonfly with a total bisection bandwidth of 7.2 TB/s. Durable storage is provided by a Lustre parallel file system that is accessible to the compute nodes through a POSIX mount point. The total aggregated bandwidth is 250 GB/s.

First, we study the DIM_JOB_COMPOSITE trace¹ of batch jobs executed on Theta between 2017 and 2019. Specifically, we extract for each job the required fields pertaining to the runtime (start time, execution time) and the number of nodes. Then, we aggregate this information to obtain the number of batch jobs and the number of nodes utilized by the batch jobs per time unit. We focus in particular on the year 2019, which reflects the most recent utilization pattern: a total of 91,217 batch jobs were executed during the entire year.

¹ https://reports.alcf.anl.gov/data/theta.html

We zoom in on the node utilization (Figure 1a) and the number of jobs (Figure 1b) per hour during January 2019. A similar


[Figure: two panels covering 2019-01-01 to 2019-02-01, both plotted against Time (h). (a) Node utilization: Num of nodes, with the maximum utilization of 4392 marked. (b) Number of jobs: Num of jobs.]
Fig. 1: Trace analysis for January 2019.

Fig. 2: Scenario-1: 12 batch jobs running on 4352 nodes.

Fig. 3: Scenario-2: 16 batch jobs running on 4352 nodes.

Fig. 4: Scenario-3: 24 batch jobs running on 4352 nodes.

pattern can be observed for the rest of the year. These figures reveal several interesting observations:
• During the entire year, all 4392 nodes were occupied for only around 1.5 days. However, the most frequent number of occupied nodes is 4352, which is close to the maximum capacity and appears for a total of 34 days. Therefore, the likelihood of having to run on-demand jobs when the machine runs batch jobs close to full capacity is very high.
• When the machine is operating close to capacity (4352 occupied nodes), the number of jobs is relatively high, peaking at around 25 jobs.
• About 61% of the batch jobs reported an execution time of less than 30 minutes. We consider these batch jobs expendable, such that killing them incurs negligible loss.

Based on these observations, we construct three representative scenarios, each of which occupies 4352 nodes at moment $t_0$ with a different number of jobs: 12, 16, and 24. We deliberately avoid expendable jobs in these scenarios (i.e., no expendable job is running at moment $t_0$) in order to create a challenging situation where all jobs may incur a significant loss of node-hours. The scenarios are illustrated in Figure 2, Figure 3 and Figure 4. The regions of interest during which all jobs are running are marked between two vertical timestamps relative to the beginning of the earliest job. For example, Figure 2 captures a scenario of 12 jobs running for a total of 10 hours and 15 minutes, where all batch jobs overlap for about 2 hours, i.e., from 02:41 to 04:52. The moment $t_0$ is chosen within these regions of interest.

B. Augmentation of the traces with checkpointing parameters

The DIM_JOB_COMPOSITE trace does not capture any information about the checkpointing behavior of the batch jobs. Lacking such information, we augment the scenarios with synthetically generated checkpointing information based on empirical observations. Specifically, we assume that all batch jobs conduct application-level checkpoints at an hourly interval. Since the batch jobs have different start times, the likelihood that their application-level checkpoints are written concurrently to the PFS is very small. Furthermore, we assume that each batch job allocates between 40% and 90% of the memory available on each node. In this case, the size of the system-level checkpoint on each node coincides with the allocated memory. Out of this memory, we assume that 20% to 60% holds critical data structures that are written by application-level checkpointing approaches. This is the size of the application-level checkpoints. We use a random threshold for each batch job, both for the application-level and system-level checkpoints, which is then used in Equation 1 to calculate the application-level and system-level checkpointing duration.

C. Compared approaches

Throughout our evaluations, we compare three approaches that can be used to solve the eviction problem:


[Figure: three panels showing response time (μs, log scale) versus deadline time (m, 0 to 15) for the Dynamic, Greedy, and Back-tracking approaches. (a) Evict at least K=512 nodes for on-demand jobs. (b) Evict at least K=1024 nodes for on-demand jobs. (c) Evict at least K=2048 nodes for on-demand jobs.]
Fig. 5: Response time for Scenario-1 consisting of 12 batch jobs. Note the log scale on the Y axis. Lower is better.

[Figure: three panels showing response time (μs, log scale) versus deadline time (m, 0 to 15) for the Dynamic, Greedy, and Back-tracking approaches. (a) Evict at least K=512 nodes for on-demand jobs. (b) Evict at least K=1024 nodes for on-demand jobs. (c) Evict at least K=2048 nodes for on-demand jobs.]
Fig. 6: Response time for Scenario-2 consisting of 16 batch jobs. Note the log scale on the Y axis. Lower is better.

1) Greedy: This algorithm implements a greedy strategy that tries to minimize the loss by checkpointing the most expensive jobs (high loss), while killing the least expensive jobs (low loss). To this end, it sorts the batch jobs in descending order of loss and tries to checkpoint them using the fastest available checkpointing method (application or system level). When the total checkpoint duration becomes larger than the deadline T, it iterates over the sorted jobs in reverse order, killing them one by one until at least K nodes have been released. While it does not produce an optimal solution, this algorithm has linear complexity and therefore has a very fast response time.

2) Backtracking: This algorithm implements an exhaustive search of all possible choices for each batch job: keep running (exclude), kill, checkpoint at application-level, checkpoint at system-level. It optimizes the search by abandoning early all combinations that cannot achieve a lower loss than the best combination found so far. Unlike Greedy, this approach always produces an optimal solution; however, it has exponential complexity and therefore may become intractable for large problem sizes.

3) CoSim: This is our proposal that implements Algorithm 1. It guarantees an optimal solution just like Backtracking, but at the same time it has a fast response time thanks to its polynomial complexity.

D. On-demand job configurations

For each of the three scenarios mentioned in Section IV-A, we consider the maximum deadline T = 15 minutes. We are interested in the optimal eviction strategy for all deadlines in the range [0 ... 15] with a granularity of one minute. Furthermore, for each of the three scenarios (Scenario-1, Scenario-2, and Scenario-3), we consider three different values for K, the minimum number of nodes that need to be evicted in order to make room for the on-demand jobs, i.e., 512, 1024, and 2048. This roughly corresponds to 12.5%, 25% and 50% of the total capacity of Theta.

E. Results

First, we focus on the performance and scalability of the three approaches. To this end, we measure the response time taken by each approach in order to produce an eviction strategy for all deadlines in the range [0 ... T]. As a consequence, in the case of Greedy and Backtracking, a separate run is executed for each i ∈ [0 ... T]. Therefore, for an increasing i, the response time measures the accumulated runtime of all i runs. In the case of CoSim, a single run is sufficient to obtain the full solution thanks to the memoization of overlapping sub-problems. This metric is important because it determines how soon the scheduler can take decisions, which in turn impacts the value that can be extracted from the on-demand jobs (i.e., faster response time leads to better on-demand job results).

The results for each of the three scenarios are depicted in Figure 5, Figure 6 and Figure 7 respectively. Note that due to the large differences in algorithmic complexity between the three approaches, the y-axis is represented in a log scale. As expected, CoSim keeps a constant response time regardless of the deadline T. Despite the accumulation of response time from an increasing number of runs, the Greedy approach is still at least 30x faster than the other two approaches thanks to its linear complexity. It is interesting to observe that for a small K and a small number of batch jobs (as illustrated by Scenario-1), Backtracking is faster than our approach. However, with increasing K and number of batch jobs (as illustrated by Scenario-2 and Scenario-3), the limitation of the exponential


[Figure: three panels showing response time (μs, log scale) versus deadline time (m, 0 to 15) for the Dynamic, Greedy, and Back-tracking approaches. (a) Evict at least K=512 nodes for on-demand jobs. (b) Evict at least K=1024 nodes for on-demand jobs. (c) Evict at least K=2048 nodes for on-demand jobs.]
Fig. 7: Response time for Scenario-3 consisting of 24 batch jobs. Note the log scale on the Y axis. Lower is better.
[Figure: three panels showing relative compute loss (node-hours) versus deadline time (m, 0 to 15) for Scenario-1, Scenario-2, and Scenario-3. (a) Evict at least K=512 nodes for on-demand jobs. (b) Evict at least K=1024 nodes for on-demand jobs. (c) Evict at least K=2048 nodes for on-demand jobs.]
Fig. 8: Relative compute loss of the Greedy approach relative to the optimal solution produced by CoSim and Backtracking. Lower is better.

search becomes clearly visible, despite the aggressive early pruning optimization. In this case, our approach is up to five orders of magnitude faster. As a general conclusion, we observe that our approach has the advantage of providing the optimal solution within a predictable constant time, which is well suited for real-time scheduling decisions.

Next, we focus on the quality of the results of the Greedy approach. Since both our approach and the Backtracking approach produce the optimal solution, we use it as a baseline that we subtract from the minimum loss found by the Greedy approach. We call this the relative compute loss. This metric is important because it indicates how much degradation in result quality must be accepted in exchange for a faster response time.

As can be observed in Figure 8, the relative compute loss is very high, indicating that the degradation in the quality of the result found by Greedy is unacceptable. In fact, with the exception of T > 13 for Scenario-3, the relative compute loss increases with T, which means Greedy suffers from an increasing degradation in the quality of the result. Also, it is important to note that in absolute terms, the minimum compute loss decreases with an increasing T for all three approaches, because more checkpointing opportunities become available. In fact, the optimal minimum loss found by CoSim and Backtracking is often 0 (meaning no job needs to be killed), especially for larger T. Therefore, even when the relative compute loss seems to decrease for an increasing T, Greedy still misses the optimal compute loss by a large margin.

Based on this observation, we conclude that sacrificing the result quality for faster response time is not beneficial, especially when considering that our approach runs in the order of milliseconds in the worst case.

V. RELATED WORK

Scheduling of batch and on-demand jobs for concurrent execution, where resource sharing is limited to each type of job, has been widely studied [12]–[19] in the past. However, not much work has been done on collocating both batch and on-demand jobs on the same set of resources [20]. SPRUCE (Special Priority and Urgent Computing Environment) [21] supports on-demand jobs by considering a basic preemptive scheduling scheme with no checkpointing. However, this leads to a significant loss of progress for the batch jobs.

Checkpointing-based preemptive scheduling has been traditionally used at the operating system level for multi-tasking. However, recent checkpointing-based preemptive scheduling schemes [13], [22] focus on reducing their overheads and improving their effectiveness in reducing the average job turnaround time. Nevertheless, these techniques do not directly address the challenges of co-scheduling batch and on-demand jobs in HPC settings.

Large-scale datacenters operated by industry (e.g., Facebook [23] and Google [24]) leverage centralized job execution environments where the centralized system accumulates jobs from multiple datacenters, and then runs the computation [25]. However, this leads to increased network traffic and job completion time when the data volume grows exponentially [26], [27]. Furthermore, regulations may restrict moving data across continents due to security and privacy constraints, thus making such approaches impractical to adopt in production environments at large.

VI. CONCLUSIONS

In this paper, we present CoSim, a simulator that enables on-the-fly analysis of the trade-offs arising between delaying the


start of opportunistic on-demand jobs, which leads to longer analytics latency, and loss of progress due to preemption of batch jobs, which is necessary to make room for such on-demand jobs. The key idea of our proposal is to implement preemption through a combination of either killing or checkpointing (at application-level or system-level) a subset of batch jobs running on the compute nodes to free enough nodes by a given deadline. To this end, we introduce a checkpointing and loss model to develop a dynamic programming algorithm to minimize the loss for a variable deadline up to a given threshold, which gives the scheduler high flexibility in exploring a wide range of alternatives. CoSim finds the optimal solution up to 5 orders of magnitude faster than backtracking approaches and offers a predictable response time in the order of milliseconds, thereby eliminating the need for greedy approaches that are fast but find only approximate solutions.

In the future, we plan to investigate several avenues: (1) applicability of our proposal to cloud computing; (2) refinement of checkpointing model (interval, interactions with PFS); (3) integration with the workload schedulers at Argonne National Laboratory’s supercomputers to validate CoSim for real-life on-demand workloads.

ACKNOWLEDGMENTS

This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research and Argonne National Laboratory. Results presented in this paper are obtained using the Chameleon and CloudLab testbeds supported by the National Science Foundation.

REFERENCES

[1] W. Tang, B. Want, S. Ethier, and Z. Lin, “Performance portability of hpc discovery science software: Fusion energy turbulence simulations at extreme scale,” Supercomputing frontiers and innovations, vol. 4, no. 1, 2017.
[2] A. S. Kozelkov, V. V. Kurulin, S. V. Lashkin, R. M. Shagaliev, and A. V. Yalozo, “Investigation of supercomputer capabilities for the scalable numerical simulation of computational fluid dynamics problems in industrial applications,” Computational Mathematics and Mathematical Physics, vol. 56, no. 8, pp. 1506–1516, 2016.
[3] P. Vranas, G. Bhanot, M. Blumrich, D. Chen, A. Gara, P. Heidelberger, V. Salapura, and J. C. Sexton, “The bluegene/l supercomputer and quantum chromodynamics,” in ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Tampa, Florida, 2006, pp. 50–57.
[4] S. R. Ellingson, J. C. Smith, and J. Baudry, “Polypharmacology and supercomputer-based docking: opportunities and challenges,” Molecular Simulation, vol. 40, no. 10-11, pp. 848–854, 2014.
[5] D. AOCNP, “Watson will see you now: a supercomputer to help clinicians make informed treatment decisions,” Clinical journal of oncology nursing, vol. 19, no. 1, p. 31, 2015.
[6] A. B. Yoo, M. A. Jette, and M. Grondona, “Slurm: Simple linux utility for resource management,” in Job Scheduling Strategies for Parallel Processing. Berlin, Heidelberg: Springer, 2003, pp. 44–60.
[7] N. Desai, “Cobalt: an open source platform for hpc system software research,” in Edinburgh BG/L System Software Workshop, 2005, pp. 803–820.
[8] G. Staples, “Torque resource manager,” in ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC), New York, NY, USA, 2006, p. 8–es.
[9] J. Ansel, K. Arya, and G. Cooperman, “DMTCP: Transparent checkpointing for cluster computations and the desktop,” in IEEE International Symposium on Parallel & Distributed Processing (IPDPS), Rome, Italy, 2009, pp. 1–12.
[10] B. Nicolae, A. Moody, E. Gonsiorowski, K. Mohror, and F. Cappello, “Veloc: Towards high performance adaptive asynchronous checkpointing at large scale,” in IEEE International Parallel and Distributed Processing Symposium (IPDPS), Rio de Janeiro, Brazil, 2019, pp. 911–920.
[11] J. Daly, “A higher order estimate of the optimum checkpoint interval for restart dumps,” Future Generation Computer Systems, vol. 22, no. 3, pp. 303–312, 2006.
[12] R. Tyagi and S. K. Gupta, “A survey on scheduling algorithms for parallel and distributed systems,” in Silicon Photonics & High Performance Computing. Singapore: Springer, 2018, pp. 51–64.
[13] V. J. Leung, G. Sabin, and P. Sadayappan, “Parallel job scheduling policies to improve fairness: A case study,” in International Conference on Parallel Processing Workshops (ICPP), San Diego, USA, 2010, pp. 346–353.
[14] A. A. Chandio, K. Bilal, N. Tziritas, Z. Yu, Q. Jiang, S. U. Khan, and C.-Z. Xu, “A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems,” Cluster computing, vol. 17, no. 4, pp. 1349–1367, 2014.
[15] A. W. Mu’alem and D. G. Feitelson, “Utilization, predictability, workloads, and user runtime estimates in scheduling the ibm sp2 with backfilling,” IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 12, no. 6, pp. 529–543, 2001.
[16] C. Gómez-Martín, M. A. Vega-Rodríguez, and J.-L. González-Sánchez, “Fattened backfilling: An improved strategy for job scheduling in parallel systems,” Journal of Parallel and Distributed Computing (JPDC), vol. 97, pp. 69–77, 2016.
[17] B. Lawson and E. Smirni, “Multiple-queue backfilling scheduling with priorities and reservations for parallel systems,” ACM SIGMETRICS Performance Evaluation Review, vol. 29, pp. 72–87, 2002.
[18] A. Tousimojarad and W. Vanderbauwhede, “An efficient thread mapping strategy for multiprogramming on manycore processors,” Parallel Computing: Accelerating Computational Science and Engineering (CSE), Advances in Parallel Computing, vol. 25, pp. 63–71, 2014.
[19] S. G. Ahmad, C. S. Liew, M. M. Rafique, E. U. Munir, and S. U. Khan, “Data-intensive workflow optimization based on application task graph partitioning in heterogeneous computing systems,” in IEEE International Conference on Big Data and Cloud Computing (BdCloud), 2014, pp. 129–136.
[20] D. Wang, E.-S. Jung, R. Kettimuthu, I. Foster, D. J. Foran, and M. Parashar, “Supporting Real-Time Jobs on the IBM Blue Gene/Q: Simulation-Based Study,” in Job Scheduling Strategies for Parallel Processing, D. Klusáček, W. Cirne, and N. Desai, Eds. Orlando, USA: Springer International Publishing, 2018, pp. 83–102.
[21] N. Trebon, “Enabling urgent computing within the existing distributed computing infrastructure,” Ph.D. dissertation, University of Chicago, USA, 2011.
[22] Q. Snell, M. Clement, and D. Jackson, “Preemption based backfill,” in Job Scheduling Strategies for Parallel Processing. Berlin, Heidelberg: Springer, 2002, pp. 24–37.
[23] J. Meza, T. Xu, K. Veeraraghavan, and O. Mutlu, “A large scale study of data center network reliability,” in ACM Internet Measurement Conference (IMC), New York, USA, 2018, pp. 393–407.
[24] S. Jain, A. Kumar, S. Mandal, J. Ong, L. Poutievski, A. Singh, S. Venkata, J. Wanderer, J. Zhou, M. Zhu, J. Zolla, U. Hölzle, S. Stuart, and A. Vahdat, “B4: Experience with a Globally Deployed Software Defined WAN,” in ACM SIGCOMM, Hong Kong, China, 2013.
[25] J. C. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford, “Spanner: Google’s globally distributed database,” ACM Transactions on Computer Systems (TOCS), vol. 31, no. 3, 2013.
[26] A. Vulimiri, C. Curino, P. B. Godfrey, T. Jungblut, J. Padhye, and G. Varghese, “Global analytics in the face of bandwidth and regulatory constraints,” in USENIX Networked Systems Design and Implementation (NSDI), USA, 2015, pp. 323–336.
[27] S. Muralidhar, W. Lloyd, S. Roy, C. Hill, E. Lin, W. Liu, S. Pan, S. Shankar, V. Sivakumar, L. Tang et al., “f4: Facebook’s Warm BLOB Storage System,” in USENIX Operating Systems Design and Implementation (OSDI), 2014, pp. 383–398.


Toward secure, efficient, and seamless reconfiguration of UAV swarm formations

Jamie Wubben¹, Pablo Aznar¹, Francisco Fabra¹, Carlos T. Calafate¹, Juan-Carlos Cano¹, Pietro Manzoni¹
¹ Departament of Computer Engineering (DISCA), Universitat Politècnica de València, Valencia, Spain
Email: jwubben@disca.upv.es, pabazcol@alumni.upv.es, frafabco@cam.upv.es, calafate@disca.upv.es, jucano@disca.upv.es, pmanzoni@disca.upv.es

Abstract—Unmanned Aerial Vehicles (UAVs) have gained a lot of interest over the last years due to the many fields of potential application. Nowadays, researchers are becoming interested in groups of UAVs working together. The collaborations between UAVs open a wide field of opportunities, because they are typically able to do more sophisticated tasks than a single UAV. However, collaboration between multiple UAVs is still a complex task, and significant challenges need to be addressed before their mainstream adoption. For instance, the automatic reconfiguration of a swarm can be used to adapt the swarm to changing application demands to solve a task in a more efficient and effective manner. However, the chances of collision become high if reconfiguration is not carefully planned. In this work we propose an approach to allow changing the shape of a UAV formation during flight through a computationally inexpensive method that is able to decrease collision chances significantly. During the experiments we tested different reconfiguration events that are prone to collisions. Results have shown that our approach maintains a safe distance (greater than 5 meters) between the UAVs, while keeping the time overhead limited to a few tenths of a second. Furthermore, scalability tests have proven that our approach can handle the reconfiguration of at least 25 UAVs simultaneously.

Index Terms—UAV; swarm reconfiguration; swarm formations

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Over the last decade the field of Unmanned Aerial Vehicles (UAVs) has gained universal interest and novel applications keep emerging every year. Due to the ever decreasing price of technology, UAVs (also known as drones) are becoming mainstream for the general public and industry as well. This results in many civilian applications in aerial photography and video, topography, entertainment, etc. [1]. More professional applications such as precision agriculture, border surveillance, package delivery, and thermal inspections are also common in the industry [2], [3]. Nowadays, UAVs are starting to be used to assist in emergency situations such as search and rescue, or disaster scenarios [4], [5], where they can act as supporting nodes for communications, being deployed on demand and offering a wider communications range and better line-of-sight (LOS) features than ground infrastructures.

Over the last few years, research has shifted more towards groups of coordinated UAVs [6]. Multi-UAV applications have great benefits as they are generally able to perform more sophisticated tasks efficiently or with more redundancy. However, organizing a multi-UAV flight is not an easy task, with challenges in terms of (i) swarm formation definition, (ii) takeoff procedure, (iii) in-flight coordination, (iv) swarm layout reconfiguration, (v) handling the loss of swarm elements, (vi) communications and data relaying optimization, and (vii) controlled landing, among others. In this work we focus on the particular problem of swarm reconfiguration during a mission. Notice that the ability to automatically change the shape of a formation during a mission can become very useful in different kinds of applications to account for variable application requirements, the loss of swarm elements, temporary flight restrictions, etc. For instance, consider a search and rescue mission where at first a swarm has to cover a large area but, upon discovering the item of interest, the swarm needs to reconfigure to better monitor that area and provide different services.

The main issue that we face during a reconfiguration is the chance of collisions, especially when the number of UAVs becomes larger. In this work we focus on a computationally inexpensive technique to reduce the chances of collision that can be deployed easily under various conditions. Our solution combines two algorithms: the first determines the optimal assignment of UAVs in the new formation accounting for their current position, while the second one splits the UAVs into different mobility groups that are shifted to different altitudes during the reconfiguration process to minimize collision risks. Experimental results show that our solution is able to minimize collision risks compared to other alternatives, while introducing only a moderate reconfiguration delay.

The rest of this paper is organized as follows: in Section II we provide an overview of related works on this topic. In Section III we detail our implementation. This implementation is then tested through different experiments, which are presented and discussed in Section IV. This work finishes with a critical discussion and the obtained conclusions in Section V.

II. RELATED WORK

The research towards swarms of UAVs has experienced a growing interest in recent years. The particular topic of flight configurations has been investigated by different authors.


The work by V.T. Hoang et al. [7] presents an algorithm can be used both during the day and during the night.
to reconfigure a formation of multiple UAVs. This work Indoor and outdoor experiments performed in obstacle rich
is especially focused on the application of vision-based in- environments have proven the effectiveness of the proposed
spection of infrastructure. It presents a new algorithm for method. Furthermore, their software is implemented in the
reconfiguration based on the angle-encoded Particle Swarm robot operating system (ROS) [11], which promotes reusability
Optimization (PSO). They begin with a 3D representation of through its modular design.
the surface to be inspected and a set of intermediate waypoints. While flocking mechanisms are great to keep a swarm of
Additionally new constraints are proposed to decrease the UAVs organized, they do not provide the flexibility to com-
chance of collision and increase task performance; based on pletely define and change the formation itself. In many appli-
the assumption that an optimal path is produced by using the θ- cations, it is useful to change the formation (for instance, from
PSO path planning algorithm. Their work differs form ours as a line to a circle); however it is difficult to encapsulate such
they use just a limited number of reconfigurations. They only behaviour using flocking mechanisms. Therefore, in our work,
focus on alignment, rotation and shrinkage, while our proposal we specifically focus on changing between different flight
is able to change the entire topology of the formation. formations. Hence, instead of using a flocking mechanism,
Other works use an approach which is called flocking. we propose a master-slave model where the master instructs
Flocking is a behaviour that is common in nature, for instance the slaves how to safely accomplish the reconfiguration.
in a group of fish, birds or insects. It consists of a few
basic rules that are applied to each entity of the group. When III. P ROPOSED MECHANISM
those rules are respected, the group will stay united without The aim of this work is to reconfigure a swarm of UAVs,
collisions between the group elements. There are various seamlessly switching from one flight formation to another.
methods to achieve a flocking behaviour for a group of UAVs, as discussed below.

Ming Chen et al. [8] present a flocking model for a UAV network based on swarm intelligence. They propose a set of rules to make sure that the slaves follow the master while maintaining a certain safe distance from it. The slaves cannot get too close, because this increases the chances of collisions; they also cannot get too far away, because otherwise communication will be lost. Simulation results show that their model can guarantee connectivity between nodes, and that it also improves bandwidth usage.

Victor Casas et al. [9] developed a flocking model without the use of a master-slave model. The UAVs in the swarm regularly broadcast and receive movement information. That information is then used to calculate two forces: a flock goal force, which guides the flock towards the target location and aligns the swarm members, and a flock members force, which provides cohesion and separation to the flock. Those two forces are used to update a direction vector which points towards the target location while at the same time avoiding collisions. Their model is tested in simulation and in real experiments, which show that a collision-free flight is ensured. They tested the model under various speeds, although all of them were rather slow (a maximum of 3 m/s). Results also showed that, during real experiments, the minimum distance between UAVs decreased; according to the authors, this is due to GPS inaccuracy.

Yazhe Tang et al. [10] presented a swarm flocking scheme that was able to work in a radio-silent environment. In contrast to many other works, their approach was not based on sending (GPS) information between the swarm elements. They used two types of vision sensors (standard and thermal cameras) to track their leader, and a LiDAR sensor to sense the surrounding environment for navigation and obstacle avoidance. Because they used various high-end sensors, their flocking mechanism

In our approach we make use of a master-slave pattern. The master is elected before taking off, as described in our previous work [12]. The master is in charge of the main calculations, and keeps the swarm synchronized throughout the different stages of the reconfiguration. All the stages are described in Figure 1. The protocol starts with the UAVs taking off and following a mission. The reconfiguration starts upon a trigger event, which can be a user input or an event predefined in the ground control station. The reconfiguration itself is divided into two stages: an analysis step where the calculations are done, and a mobility step where the UAVs move to their target locations in an intelligent manner to avoid collisions. After the swarm has reconfigured itself, the mission can continue. The protocol finishes at the end of the mission by landing all the UAVs.

A. Phase 1: Analysis

In a previous work we developed an algorithm to determine which UAV should be the master in a UAV swarm [12]. To understand our current proposal it is only relevant to know that a single master is assigned, and that it will always be located in a central position of the flight formation to minimize losses on the wireless channel. In this first step, the master decides the slaves' positions (later referred to as intelligent positioning). The idea is that the overall flight distance is minimised by choosing the UAV that is already closest to a new flight position to fly to it. This algorithm is also explained in more detail in [12]. Basically, it consists of the following four steps:
1) Find a central location with respect to the current location of the UAVs.
2) Calculate the Euclidean distances from that central location to the positions in the new flight formation.
3) Sort this list in descending order.
4) Assign each location in the flight formation to the closest UAV.
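The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from [12]: the central location is taken to be the centroid of the current UAV positions, positions are 2D tuples, "closest UAV" is read as the closest UAV that has not yet received a target, and the function and variable names are hypothetical.

```python
import math

def assign_positions(uav_positions, formation_positions):
    """Greedy target assignment following the four steps described above.

    uav_positions: dict mapping a UAV id to its current (x, y) position.
    formation_positions: list of (x, y) positions in the new formation.
    Returns a dict mapping each UAV id to its assigned (x, y) target.
    """
    # Step 1: central location (assumption: centroid of the current positions).
    n = len(uav_positions)
    cx = sum(x for x, _ in uav_positions.values()) / n
    cy = sum(y for _, y in uav_positions.values()) / n

    # Steps 2 and 3: distance from the centre to every target, sorted descending,
    # so that the outermost targets are served first.
    targets = sorted(
        formation_positions,
        key=lambda p: math.dist((cx, cy), p),
        reverse=True,
    )

    # Step 4: give each target to the closest UAV that is still free (assumption).
    free = dict(uav_positions)
    assignment = {}
    for target in targets:
        uav_id = min(free, key=lambda u: math.dist(free[u], target))
        assignment[uav_id] = target
        del free[uav_id]
    return assignment

# Usage: three UAVs in a line move to a parallel line 10 m away.
uavs = {"a": (0.0, 0.0), "b": (10.0, 0.0), "c": (20.0, 0.0)}
new_formation = [(0.0, 10.0), (10.0, 10.0), (20.0, 10.0)]
result = assign_positions(uavs, new_formation)
```

In this small example every UAV receives the target directly in front of it, so no flight paths cross.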

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

Fig. 1: Flowchart of the flight formation algorithm

While the algorithm was originally designed to ensure a safe and fast takeoff procedure, we were able to reuse it for our current swarm reconfiguration purposes.

In order for the master to execute this algorithm it needs to know where all the UAVs currently are, and what new locations are defined in the new swarm layout. The current locations are known by the master since it defines and maintains the swarm topology at all times. Regarding the new locations, these are associated to the new flight formation as specified by the controller. It is worth pointing out that, in the mobility stage, the UAVs will fly at different altitudes to reduce the chance of collisions. So, the next thing the master needs to do is to decide which UAV flies at which altitude. That process is fully described in Algorithm 1. It details how the master calculates (for each UAV) in what direction it has to go. Based on that direction, the UAVs are placed into different sectors. Each sector has a different altitude assigned to it (in our experiments we use a simple 5 meter increment from one sector to the next). In this manner, UAVs that are likely to cross each other's path will fly at different altitudes, and thus we decrease the chances of collisions. Note that this algorithm does not guarantee a collision-free reconfiguration. Once the calculations are done, the master will start sending messages with the target location (x, y, ∆z) to all slaves. Upon receiving this message, the slaves will reply with acknowledgements, and, once all the slaves have received their target location, the swarm will transition to the mobility step.

Algorithm 1 Section select procedure
Require: numberOfSections > 0
for UAV in UAVs do
    ∆x ← UAV.targetLoc.x − UAV.startLoc.x
    ∆y ← UAV.targetLoc.y − UAV.startLoc.y
    α ← atan2(∆y, ∆x)
    if α < 0 then
        α ← α + 2π
    end if
    sectorWidth ← 2π / numberOfSections
    sector ← 0
    for i in range(0, numberOfSections) do
        min ← i × sectorWidth
        max ← (i + 1) × sectorWidth
        if min ≤ α < max then
            sector ← i
        end if
    end for
end for

B. Phase 2: Mobility

The mobility step is split up into three states: first the UAVs will change altitude, depending on their sector as explained in the previous section (movement in the Z direction), then they will go towards their target location (X,Y movement), and finally they will return to their initial altitude (return to default Z value). In each state the master will send messages to the slaves. When a slave receives the message it will perform the movement and reply with an acknowledgement once the movement is finished. The master receives the acknowledgements and, when all the slaves have sent an acknowledgement message (and the master has reached its position), the master will transition to the next state. At that moment, the master will start sending messages from its new state; slaves will receive those messages, and transition too. The messages sent by the master only contain an id which represents the current


state. They do not have to contain the location information because this was already sent in phase 1.

As a final remark, it is worth pointing out that our proposal is computationally efficient. Algorithm 1 is the only element with significant computational requirements, and it is limited to O(N²). Since in most practical applications the number of UAVs in a swarm will be low (below 100), this algorithm can be easily executed on the UAV's onboard computer, such as a Raspberry Pi. Also, the network will not be overloaded since the message payloads are quite small.

IV. EXPERIMENTAL SETTINGS AND RESULTS

We performed a wide set of experiments in our own UAV emulator/simulator in order to assess the validity and robustness of our proposed mechanism. Before providing a detailed explanation about our experiments and the results obtained, we will briefly discuss our simulator environment called ArduSim.

A. ArduSim

ArduSim is a multi-UAV flight simulator/emulator; it is available online [13] under the Apache License 2.0. The simulator has many features, which are fully explained in our previous work [14]. Here, we will just highlight some of the key characteristics.

First of all, ArduSim makes it easy, fast and reliable to deploy a protocol that was developed in the simulator to real UAVs. It does this mainly by implementing the same open source protocols and standards that are used by the majority of the UAVs. Besides that, ArduSim really is a multi-UAV flight simulator; it is able to scale up to 100 UAVs in real time, and up to 256 UAVs in soft real time on a high-end PC (Intel Core i7-7700, 32 GB RAM). Wireless communication models, based on real experiments, are implemented to support UAV-to-UAV communications; notice that this is a basic requirement for nearly all swarm applications. Furthermore, a lot of basic UAV functionality (such as taking off, moving to a GPS location, etc.) is provided by the Application Programming Interface (API). The user is provided with a functional GUI and extensive logging features.

Overall, ArduSim is a versatile tool that provides researchers the opportunity to quickly develop new applications and protocols, without losing accuracy and/or customization.

B. Safety analysis

Our approach combines an intelligent UAV assignment (see Section III-A) with a sectorization procedure that groups UAVs moving in similar directions so that their mobility takes place at different heights (see Section III-B). To assess the effectiveness of this combined approach, we will compare it to other (simpler) variants where such mechanisms are not used, so that we can evaluate which part has the most influence and whether our approach (as a whole) is effective. Therefore, we propose three other (but similar) approaches, which together with our own give four variants:
A. Random position assignment, no altitude change.
B. Random position assignment, different altitudes.
C. Intelligent positioning, no altitude change.
D. Intelligent positioning, different altitudes.

In our first set of experiments, 9 UAVs changed from a linear formation towards a compact mesh formation (see Figure 2 for an example). The minimum distance between UAVs in that formation was set to 10 meters, the number of sectors was equal to three, and the altitude difference between sectors was 5 meters. These variables can be set by the user. In real experiments, and in some network models in our simulator, UAVs cannot communicate when the distance between them is greater than 1200 meters. Therefore, although possible in simulation, the distance between the UAVs must not be set too large. For that reason, we have chosen the above-mentioned values because they are realistic and provide enough clearance to prevent UAVs from colliding due to GPS errors, wind gusts, etc. During the experiments we measured the time that the UAVs spent in each state (Move Z, Move XY, Move Z Initial), the minimum distance between the UAVs during the Move XY state, and the potential number of collisions. In our experiments, a collision happens when the Euclidean distance between two UAVs is smaller than 5 meters, to account for the GPS offset error.

Fig. 2: Transition of 9 UAVs from a linear formation to a compact mesh

The results are shown in Table I and Table II. Our experiments have shown (as stated before) that merely changing the formation layout without adopting any type of strategy is very dangerous, and prone to cause collisions. We can also observe that changing only the altitude, or only assigning the positions of the UAVs in an intelligent manner, is not enough to avoid collisions in all cases. Only when both were used could collisions be entirely avoided. Furthermore, while changing the altitude does make the process safer, an additional time overhead is introduced. The time overhead depends on the number of sectors and the altitude difference between the sectors; the impact of both parameters is discussed in more detail in the following experiments. Implementing an intelligent positioning system reduces the overall flight distance and, therefore, flight times are slightly shorter in experiments C and D.

C. Scalability

In our second experiment we want to evaluate the scalability of our protocol. We searched for the minimal number of


sectors needed to complete a collision-free reconfiguration, for different numbers of UAVs, and for different formations (see Figure 3). All of the formations were prone to collisions due to the small distance between the UAVs that was defined (≤ 10 m). We started with 9 UAVs (as in the previous experiment), and increased this value up to 25 UAVs. The results are shown in Figure 4. As expected, the minimum number of sectors required to guarantee a collision-free reconfiguration increases with the number of UAVs. The rate of increase depends highly on the type of formation.

TABLE I: Collisions and minimum distance analysis.

Ex   Nr. collisions   Min. distance between UAVs [m]
A    4                0.44
B    2                0.33
C    2                3.58
D    0                6.15

TABLE II: Time UAVs spend in each state.

Ex   Move Z [ms]   Move XY [ms]   Move Z Initial [ms]
A    404           13607          400
B    6802          13030          7980
C    380           12425          400
D    8600          12415          8600

Fig. 3: Different types of formations. (a) Circle formation. (b) Matrix formation. (c) Mesh formation. (d) Linear formation.

Fig. 4: Minimum number of sectors required for a collision-free reconfiguration procedure.

D. Differences between various transitions

Due to our findings in the previous experiment, we investigated the influence of the type of formation in greater detail. In particular, we tested all the possible transitions between the four flight formations considered (Linear, Matrix, Mesh, Circle). The experimental settings are similar to the previous ones. We worked with 15 UAVs in formations where the distance between the UAVs is less than 10 meters. During the experiment we searched for the minimal number of sectors needed to complete a collision-free reconfiguration. We also measured the time spent in each state. Results are shown in Figures 5 and 6. As we can observe from Figure 5, results vary significantly depending on the specific transition; in some cases, such as going from a mesh to a matrix formation, just a few sectors are needed. In other cases (e.g. matrix to linear) the angles α calculated in Algorithm 1 are very similar, and so many sectors are required in order to separate the UAVs into different altitude groups. In the presence of many groups, the target altitude can grow a lot, resulting in a high time overhead (in the worst-case scenarios), as shown in Figure 6. Due to the similar shape of both figures, we can see the correlation between the number of sectors and the overall reconfiguration time. Furthermore, we can conclude that the time spent moving in the xy-plane fluctuates only a little, being limited to a maximum of 6.8 seconds in our experiments.

E. Time overhead

Finally, we further investigated the time overhead introduced by changing UAV altitudes during reconfiguration. To achieve this we start by finding the value of the one-way delay T for which:

$$\int_{0}^{T} v(t)\,dt = D \qquad (1)$$

where

$$D = \text{num\_sectors} \times \text{sector\_offset}$$
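As an illustration of Equation (1), the following sketch solves for T numerically. The paper does not specify v(t), so a trapezoidal speed profile is assumed here (constant acceleration up to a cruise speed, then a symmetric deceleration back to zero). The cruise speed of 2 m/s, the acceleration of 1 m/s², and all function names are assumptions made only for this example.

```python
def speed(t, total_time, cruise=2.0, accel=1.0):
    """Assumed trapezoidal profile: ramp up, cruise, ramp down to zero at total_time."""
    return max(0.0, min(accel * t, cruise, accel * (total_time - t)))

def distance(total_time, steps=2000, **profile):
    """Numerically integrate v(t) over [0, total_time] with the midpoint rule."""
    dt = total_time / steps
    return sum(speed((i + 0.5) * dt, total_time, **profile) for i in range(steps)) * dt

def one_way_delay(target_distance, tol=1e-6, **profile):
    """Find T such that the integral of v(t) from 0 to T equals target_distance."""
    lo, hi = 0.0, 1.0
    # Grow the upper bound until the UAV can cover the requested distance.
    while distance(hi, **profile) < target_distance:
        hi *= 2.0
    # Bisection: the covered distance grows monotonically with T for this profile.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if distance(mid, **profile) < target_distance:
            lo = mid
        else:
            hi = mid
    return hi

# Usage: 3 sectors with a 5 m offset give D = 15 m.
t_one_way = one_way_delay(3 * 5.0)
```

Under these assumed values the solver returns roughly 9.5 s, which is the 7.5 s needed at constant cruise speed plus 2 s lost to the acceleration and deceleration ramps.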


Fig. 5: Minimum number of sectors required for a collision-free reconfiguration w.r.t. the type of transition.

Fig. 6: Time spent moving horizontally (state 2) and vertically (states 1, 3).

This one-way delay refers to both upward and downward movements. We can approximate this one-way time overhead T as:

$$T \cong \frac{D}{\hat{v}_{0\to T}} + \epsilon \qquad (2)$$

where $\hat{v}_{0\to T}$ refers to the expected speed during the entire movement from time 0 to T, and $\epsilon$ accounts for the additional time associated with the acceleration and deceleration processes. In our experiments $\hat{v}_{0\to T}$ was set to 2 m/s, and the distance between the sectors (sector_offset) to 5 meters. The number of sectors ranged between 2 and 8. Figure 7 compares the estimated time overhead for the two-way vertical mobility against the real time overhead measured in our experiments. We can clearly observe a linear pattern (as expected from the derivation), and in our case the average value of $\epsilon$ is 1.5 s.

While the speed of the UAVs does influence the time overhead directly, it does not alter the chances of a collision. This is because all the UAVs are flying at the same speed, and thus the distance between them will not change.

Fig. 7: Estimated time overhead vs real time overhead

V. CONCLUSIONS AND FUTURE WORK

Research in the field of Unmanned Aerial Vehicles (UAVs) has been significantly boosted over the last few years, and as a result we are now able to tackle more difficult problems such as multi-UAV coordinated flights. These so-called swarms are able to extend UAV-based applications by allowing work to be done in parallel, with more redundancy, and providing the ability to carry heavier loads. However, coordinating multiple UAVs is not an easy task. In this work we focused specifically on the reconfiguration of a swarm. This is a relevant behaviour that can be used to make many applications more efficient and/or effective. However, the chances of collision become high during reconfiguration, and so it becomes an issue that must be dealt with. Our proposal is based on an intelligent position assignment system that reduces the chances of flight paths crossing during formation reconfiguration. The chances of collision are further reduced by distributing the UAVs over different altitude levels during the reconfiguration period. This simple, computationally efficient approach can be easily applied to various environments. However, it is not able to fully guarantee a collision-free reconfiguration in all cases, and when scaled up to many UAVs the time overhead introduced becomes significant. For these reasons, our future work will focus on more complex algorithms that combine path prediction with machine learning approaches to avoid collisions in a more time-efficient manner.

ACKNOWLEDGMENTS

This work was partially supported by the “Ministerio de Ciencia, Innovación y Universidades, Programa Estatal de Investigación, Desarrollo e Innovación Orientada a los Retos de la Sociedad, Proyectos I+D+I 2018”, Spain, under Grant RTI2018-096384-B-I00.

REFERENCES

[1] G. S. Research, “Drones: Reporting for work.” https://www.goldmansachs.com/insights/technology-driving-innovation/drones/, 2014. Accessed: 2020-06-04.
[2] H. Shakhatreh, A. H. Sawalmeh, A. Al-Fuqaha, Z. Dou, E. Almaita, I. Khalil, N. S. Othman, A. Khreishah, and M. Guizani, “Unmanned aerial vehicles (UAVs): A survey on civil applications and key research challenges,” IEEE Access, vol. 7, pp. 48572–48634, 2019.
[3] S. Hayat, E. Yanmaz, and R. Muzaffar, “Survey on unmanned aerial vehicle networks for civil applications: A communications viewpoint,” IEEE Communications Surveys Tutorials, vol. 18, no. 4, pp. 2624–2661, 2016.
[4] P. Vincent and I. Rubin, “A framework and analysis for cooperative search using UAV swarms,” in Proceedings of the 2004 ACM Symposium on Applied Computing, SAC ’04, (New York, NY, USA), pp. 79–86, ACM, 2004.


[5] M. Aljehani and M. Inoue, “Multi-UAV tracking and scanning systems in M2M communication for disaster response,” in 2016 IEEE 5th Global Conference on Consumer Electronics, pp. 1–2, Oct 2016.
[6] A. Tahir, J. Böling, M.-H. Haghbayan, H. T. Toivonen, and J. Plosila, “Swarms of unmanned aerial vehicles - a survey,” Journal of Industrial Information Integration, vol. 16, p. 100106, 2019.
[7] V. T. Hoang, M. D. Phung, T. H. Dinh, Q. Zhu, and Q. P. Ha, “Reconfigurable multi-UAV formation using angle-encoded PSO,” in 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), pp. 1670–1675, 2019.
[8] M. Chen, F. Dai, H. Wang, and L. Lei, “DFM: A distributed flocking model for UAV swarm networks,” IEEE Access, vol. 6, pp. 69141–69150, 2018.
[9] V. Casas and A. Mitschele-Thiel, “Implementable self-organized flocking algorithm for UAVs based on the emergence of virtual roads,” in Proceedings of the 6th ACM Workshop on Micro Aerial Vehicle Networks, Systems, and Applications, DroNet ’20, (New York, NY, USA), Association for Computing Machinery, 2020.
[10] Y. Tang, Y. Hu, J. Cui, F. Liao, M. Lao, F. Lin, and R. Teo, “Vision-aided multi-UAV autonomous flocking in GPS-denied environment,” IEEE Transactions on Industrial Electronics, vol. PP, pp. 1–1, 04 2018.
[11] Stanford Artificial Intelligence Laboratory et al., “Robotic operating system.”
[12] F. Fabra, J. Wubben, C. Calafate, J. Cano, and P. Manzoni, “Efficient and coordinated vertical takeoff of UAV swarms,” in IEEE 91st Vehicular Technology Conference (VTC2020-Spring), May 2020.
[13] “ArduSim. Accurate and real-time multi-UAV simulation.” https://bitbucket.org/frafabco/ardusim/src/master/, 2017. Accessed: 2020-05-11.
[14] F. Fabra, C. T. Calafate, J.-C. Cano, and P. Manzoni, “ArduSim: Accurate and real-time multicopter simulation,” Simulation Modelling Practice and Theory, vol. 87, pp. 170–190, Sep 2018.


SEMRP: an Energy-efficient Multicast Routing Protocol for UAV Swarms

Youssra Cheriguene∗, Soumia Djellikh∗, Fatima Zohra Bousbaa¶, Nasreddine Lagraa¶, Abderrahmane Lakas§, Chaker Abdelaziz Kerrache‖, and Abdou El Karim Tahari¶

∗ University of Laghouat, Algeria, {firstnameinitial.lastname}.inf@lagh-univ.dz
¶ Computer Science Laboratory, University of Laghouat, Algeria, {firstnameinitial.lastname}@lagh-univ.dz
§ UAE University, Al Ain, UAE, alakas@uaeu.ac.ae
‖ University of Ghardaia, Algeria, ch.kerrache@univ-ghardaia.dz

Abstract: The deployment of a swarm of cooperative UAVs for the execution of distributed tasks has attracted increasing attention from both academia and industry. The use of a group of UAVs instead of one single UAV offers many advantages, like extending the mission coverage, providing reliable ad-hoc network services, and enhancing the service performance, to name a few. However, due to the highly dynamic nature of the swarm topology, the coordination of a large number of UAVs poses new challenges to traditional inter-UAV communication protocols. Therefore, there is a need for the design of new networking protocols that can efficiently support the fast-paced and real-time requirements of coordinated swarm navigation in various environments. In this paper, we propose SEMRP, a Swarm Energy-efficient Multicast Routing Protocol for UAVs flying in group formations. The main purpose of SEMRP is to facilitate the control and information delivery between UAVs while minimizing inter-UAV packet loss, packet re-transmission, and end-to-end delay. In this study we show how SEMRP achieves these objectives by taking into account various Quality-of-Service parameters like the network throughput, the UAVs' mobility, and energy efficiency to ensure a timely and accurate information delivery to all members of a UAV swarm. The results of the simulations conducted using NS-2 support the efficiency of our proposal through its two presented versions (SEMRP-v1 and SEMRP-v2) in terms of reducing the total emission energy (by at least 10 dBm), reducing the end-to-end delay by 44%, and increasing the packet delivery ratio by more than 22% compared to the SP-GMRF protocol.

Index Terms: UAVs, Swarm of UAVs, Multicast Routing Protocol, SEMRP.

978-1-7281-7343-6/20/$31.00 © 2020 IEEE

I. INTRODUCTION

Unmanned Aerial Vehicles (UAVs) have recently attracted significant interest in civilian and military applications, such as search and rescue operations, managing wildfire, agricultural applications, patrolling, delivery of goods, monitoring and surveillance [1, 2]. Swarms of UAVs may further increase the effectiveness of these tasks. For instance, multi-UAV cooperation makes it possible to enable larger mission coverage and to improve the operation performance.

As the technology of cooperative UAVs grows and their cost decreases, they become an interesting way to undertake several difficult applications, especially when the drones form a swarm. The ability of swarming many UAVs to perform complex tasks is attractive because it solves the limitations of single UAV systems, like the limited payload and flight time. It also adds more functionalities and advantages, including time savings, reduction in manpower, and optimization of operational expenses. In a single UAV system, if the UAV or a sensor/hardware fails, the UAV should return to the base. However, in swarm-based systems, other UAVs can share tasks among themselves, and this increases the fault tolerance of the system. For example, in search missions, using a swarm of UAVs can parallelize the individual tasks, thus decreasing the completion time of the mission, extending the coverage range, and also providing real-time images and videos which may improve the quality of the operation.

Despite the attractive advantages of deploying UAV swarms, it still poses several challenging issues that may affect their reliability and stability. To support their various applications, maintain their stable functioning, and fully exploit their features, it is necessary to design efficient routing protocols adapted to the targeted missions. To this end, many swarm routing protocols have been proposed in the context of Flying Ad hoc NETworks (FANETs). These works can be classified into three main classes: (i) bioinspired-based [3]–[6], (ii) geographical location-based [7], and (iii) multicast-based [8, 9]. In this work, we focus on the last category (i.e., multicast-based routing protocols), which offers more advantages such as reduced bandwidth utilization in data distribution from a source to its group members. Solutions presented in this category do not simultaneously address the power consumption, reliability, and network scalability. Therefore, we attempt to design a new efficient multicast routing protocol for swarm-based systems that distributes data from one source node to a specific group of mobile drones, while minimizing the number of connections in the network, ensuring both the reliability and the scalability, and optimizing the global energy consumption that directly affects the system's lifetime.

The proposed solution can be applied in COVID-19 applications such as surveillance for purposes like social distancing violation detection, in addition to various other applications


like spraying areas with disinfectants. The use of swarms of UAVs is highly beneficial for these types of applications, due to low operational costs and high spatial resolution of imagery, especially in large and hard-to-reach zones.

The rest of this paper is organized as follows. We present the related works in Section II. In Section III, we provide the details of our proposed multicast routing algorithm. Then, the performance of our proposed algorithm is evaluated and discussed in Section IV. Finally, Section V concludes the paper.

II. RELATED WORKS

Efficient routing protocols are required for successful communication among the cooperating UAVs in a swarm. There are many routing protocols used in this class of networks, and they can be classified into three main categories: (i) bioinspired-based, (ii) geographical location-based, and (iii) multicast-based approaches.

A. Bioinspired-based solutions

Various algorithms based on swarm intelligence can be successfully adapted for cooperative UAVs. They fall into the category of bioinspired algorithms [10], and they are usually classified into two categories: Ant Colony Optimization-based approaches and Bee Colony Optimization-based approaches.

1) Ant Colony Optimization-based approaches: The ACO algorithm is based on the social behavior of ants on the way to find the shortest path to the source of food [3, 4]. For instance, in [3], the authors proposed a bio-inspired algorithm named “APAR” to solve the communication problems in multi-UAV systems. APAR integrates the ACO algorithm with the well-known DSR. APAR proposes to avoid congestion and link breakage by establishing standards to choose routes based on sensing the distance of a route, the stability of a route and the congestion level of a route. However, its main drawback is that it introduces overhead and delays and prevents high mobility nodes from participating in route discovery.

AntHocNet is an ACO-based routing protocol proposed in [4] to solve the problem of high mobility in FANETs. AntHocNet is more suitable for large networks with high mobility. However, it is less effective because of the high costs of transferring routing service information.

2) Bee Colony Optimization-based approaches: BCO is the sub-class of bio-inspired approaches that take their models from the bee behavior in the natural habitat [11]. That is, the bee hive operating principle is based on a clear distribution of responsibilities among the bees. All bees of the hive can be divided into three groups [12, 13]: employee bees, onlookers, and scouts. BeeAdHoc [5], which is one of the BCO algorithms, has two different stages during its functioning: (i) a scouting stage, during which forward and backward scouts, including the source ID, the number of hops, and the minimal residual energy, are flooded across the network to establish multiple paths between the communicating nodes; and (ii) a resource foraging stage, during which the data packets are delivered from the source to the destination using the forager bees. Despite its increased performance over traditional FANET routing algorithms in most cases, BeeAdHoc is characterized by complex behavior modeling.

Pan et al. [6] proposed CA-BCO as an improved version of the BCO algorithm, to solve the UAV route planning problem. CA-BCO uses a probabilistic representation of the population to replace the design variables of the solution search space of the BCO algorithm. CA-BCO provides better performance within the category of memory-saving algorithms [7]. However, it has the same drawbacks as BeeAdHoc because it uses the same BCO algorithm [7].

B. Geographic-based

This class of routing methods is based on the geographical location of the members of the swarm (geocast routing protocols). In [7], the authors propose GeoUAVs, designed for managing wildfire, especially in hard-to-reach zones, which aims at delivering data to a specific group of mobile UAVs identified by their geographical location to manage an active fire. It takes into account the mobility of nodes with 3D movement and manages to reduce the delays and maximize the throughput. However, it does not take power consumption into consideration.

C. Multicast-based

In this class of routing methods, a source UAV may need to send data to a specific group of UAVs, hence the use of group communication or multicast. The main advantage of multicast routing is the reduction in transmission overhead, in control message overhead, in power consumption, and in network partitioning [14]. In [8], the authors proposed SP-GMRF, which offers a mechanism to predict the nodes' current positions, allowing it to rule out the nodes that are within the communication range as possible next hop, and then adds to the multicast tree the one-hop neighbors that provide the shortest distance to each of the destination nodes. However, SP-GMRF suffers from the absence of power management. In addition, data delivery from the source starts only after the tree construction is complete, which can cause link breakages, especially when the tree construction time increases. Another setback is the computation of the best next hop when the number of one-hop neighbors or the number of destinations increases, since the selection procedure requires selecting the shortest distance between each neighbor-destination pair, which can lead to numerous issues including power consumption and disconnections.

DPTR [9] is another protocol for FANETs, designed to handle transmission in collaborative ad-hoc networks. By adding certain rules to the formation of Red-Black (R-B) trees, a distributed priority tree is formulated. This tree forms a priority network that allows selection of an appropriate node and a channel for relaying to avoid network fragmentation. Although DPTR is scalable and overcomes network fragmentation, it does not support mobility and requires a considerable effort for management and control [5]. In addition, it does not take energy consumption into consideration.


The majority of the methods described above lack reliability and scalability [7]. A further major issue is that they are power-agnostic, which ignores the fact that UAVs are power-hungry devices, especially under the additional constraint of flying in swarm formation. Therefore, with SEMRP, we propose a new multicast routing protocol that addresses the aforementioned issues. We describe SEMRP in detail in the following section.

III. SEMRP: ENERGY-EFFICIENT MULTICAST ROUTING PROTOCOL FOR UAV SWARMS

In this section, we explain the basic functioning of our solution through two versions (SEMRP-v1 and SEMRP-v2), which take different perspectives but share the objectives discussed in the system model.

A. System Model

To overcome the above-mentioned challenges, we propose a new multicast approach for a swarm of UAVs that chooses the shortest distance between UAVs, optimizes the overall energy consumption in the network, and extends the operating time of the swarm.

Our protocol aims at constructing a multicast tree to deliver data from one source to the UAVs that are members of the swarm ($U$). We make the following assumptions:

• Each UAV is aware of its position in the 3D environment with the help of a Positioning Service. The UAVs are equipped with cameras, sensors and other necessary equipment according to the application.
• The transmission range of a drone can be changed flexibly by adjusting its transmission power. The transmission range of each drone is bounded by the Maximum Transmission Range (MTR), which is the communication range achieved with the Maximum Transmission Power (MTP).
• To facilitate the movement of the drones and the delivery of data, we assume that the infected city area has a uniform shape, for example rectangular or square.

In our model, we adopt the Friis power transmission formula, which is expressed as follows:

\[ \frac{P_j}{P_i} = G_i G_j \left( \frac{\lambda}{4\pi D_{(u_i,u_j)}} \right)^{\alpha} \tag{1} \]

where $P_i$ is the emission power of the transmitting drone $u_i$, $P_j$ is the power received at drone $u_j$, $G_i$ and $G_j$ are the antenna gains of the transmitter and the receiver respectively, and $\lambda$ is the wavelength. The parameter $\alpha$ is typically in the range of 2 to 4, depending on the characteristics of the communication medium [15]. In FANET applications, $\alpha$ is set to 2. $D_{(u_i,u_j)}$ is the distance between drones $u_i$ and $u_j$.

When constructing a multicast tree from the source drone $u_s$ to all destination nodes, it is necessary to consider the Signal-to-Noise-and-Interference Ratio (SNIR) requirement for wireless communications. In the free-space propagation model, the path loss is generally a function of the distance between UAVs. The SNIR requirement should be satisfied for each UAV to receive multicast messages. Thus, the received signal power level at each drone should be above the minimum sensitivity level, denoted $P_{MinS}$. As specified in IEEE 802.11p, the minimum sensitivity level equals -68 dBm for a throughput of 27 Mb/s, and the 5.9 GHz band is used for U2U communications. Thus, the transmitting power should satisfy:

\[ P_i \geq \frac{P_{MinS}}{G_i G_j} \left( \frac{4\pi \sqrt{\Delta x^2 + \Delta y^2 + \Delta z^2}}{\lambda} \right)^{2} \tag{2} \]

where $\Delta x = x_i - x_j$, $\Delta y = y_i - y_j$, and $\Delta z = z_i - z_j$.

Our objective is to design an optimal tree for delivering multicast messages such that the total power consumption is reduced as much as possible. The objective function can then be expressed as:

\[ \min \sum_{u_f \in T} \left[ \frac{P_{MinS}}{G_i G_j} \left( \frac{4\pi}{\lambda} \right)^{2} \cdot \max_{k \in N(u_f)} D^{2}_{(u_f,u_k)} \right] \tag{3} \]

Our solution designs the multicast tree using two different methods, SEMRP-v1 and SEMRP-v2. Both aim at choosing the nearest route to the multicast destination nodes and at switching to a closer route if one is found. This process helps to minimize the transmission energy of forwarders and to reduce the number of hops in the tree.

In the following sub-sections, we present the steps and working logic of both versions. Each version creates a multicast routing tree that spans the UAVs that are members of the multicast group.

B. SEMRP-v1 Protocol

The construction of the multicast routing tree starts on demand, when a source has data to send to multiple multicast destinations. The formal description of the tree construction process is presented in Algorithm 1.

The SEMRP-v1 process is carried out in the following phases:

(i) Initially, the multicast tree only contains the root, which is the source $u_s$.

(ii) Then, $u_s$ calculates the distances to its neighbors, as highlighted in step 2 of Algorithm 1.

(iii) The source $u_s$ initiates the construction of the tree by adding the node with the shortest distance to the tree and adjusting its transmission power to reach it. The necessary steps are steps 3, 4, 5, and 6 of Algorithm 1.

(iv) Next, $u_s$ checks its other one-hop neighbors to decide whether to increase its transmission power, by testing the condition shown in step 7 of Algorithm 1 and following its sub-steps.

(v) The source proceeds by notifying the nodes it has added to the tree so that they can continue the construction of the tree, as shown in step 8 of Algorithm 1. The notified nodes that are in the tree act as the new source and repeat the steps of the algorithm starting from step 2 of Algorithm 1, along with updating their


neighbors' distances from the source. The source also notifies the neighbors that are not members of the tree, so that they can discover their new one-hop neighbors (Algorithm 1: step 9).

(vi) After a leaf node is reached, the leaf node begins finalizing the first phase of the multicast tree construction by returning to its parent node (Algorithm 1: step 10). The parent node waits for leaf returns from all its tree branches so that it can return to the next parent level, as highlighted in steps 3 and 4 of Procedure 1. The procedure is repeated until the initial source is reached.

(vii) The source then begins the second phase of multicast tree construction, which is searching for the remaining nodes as highlighted in steps 1 and 2 of Procedure 1, in order to add them to the multicast tree.

(viii) To ensure the optimal connectivity of the tree, the remaining nodes are added to the tree with the tree member at the shortest distance as their parent. The source initiates the search for non-tree members by broadcasting a special message to its neighbors; tree members forward the message until it reaches nodes that are not yet in the tree, as shown in step 1 of Procedure 2. The non-tree members can be either isolated nodes that were not discovered during the first phase of tree construction, or nodes that were discovered but did not satisfy the conditions in steps 3 and 7 of Algorithm 1.

The isolated nodes are added to the tree with the sender node as their parent, as shown in steps 2.a of Procedure 2, while the non-isolated nodes choose the one-hop tree member at the shortest distance as their parent, as shown in steps 2.b of Procedure 2. The parent nodes then adjust their transmission power as a function of the distance to the recently added child (Procedure 2: step 2.3a).

After the transmission power of the parent has been adjusted, its neighbors that satisfy the condition in step 2.4a of Procedure 2 change their parent node. In parallel, during the construction of the multicast tree, the source sends the data packets, which include the multicast destinations in the header. The packets are forwarded in the tree until they reach the multicast destinations.

Algorithm 1: Multicast Construction Tree of SEMRP-v1
  Input: $u_s$, $N_u$
  Output: MRT, the Multicast Routing Tree rooted at $u_s$
  Steps:
  1. $u_s$ adds itself to the tree $T$;
  2. for each node $u_i \in N(u_s)$ do
       $D_{(u_s,u_i)} = \sqrt{(X_i - X_s)^2 + (Y_i - Y_s)^2 + (Z_i - Z_s)^2}$;
     end
  3. Add the nearest neighbor $u_i$ of $u_s$ with $D_{(u_s,u_i)}$ to $T$;
  4. Mark $u_s$ as the parent of $u_i$;
  5. $u_s$ adjusts its transmission power $P_{(u_s,u_i)}$ to reach $u_i$;
  6. Increase the number of branches of $u_s$;
  7. for each node $u_j \in (N(u_i) \cup N(u_s))$ do
       if $(D_{(u_s,u_j)} - D_{(u_s,u_i)}) < D_{(u_i,u_j)}$ then
         7.1 Add $u_j$ to $T$ and make $u_s$ the parent of $u_j$;
         7.2 $u_s$ adjusts its transmission power $P_{(u_s,u_j)}$ to reach $u_j$;
         7.3 Increase the number of branches of $u_s$;
       else
         7.4 Add $u_j$ to $T$ and make $u_i$ the parent of $u_j$;
         7.5 $u_i$ adjusts its transmission power $P_{(u_i,u_j)}$ to reach $u_j$;
         7.6 Increase the number of branches of $u_i$;
       end
     end
  8. Notify the nodes added to the tree;
  9. Notify the rest of the nodes that are not yet in the tree;
  10. if the current node is a leaf then
        ReturnToParent(ParentId);
      end

Procedure 1: ReturnToParent(int ParentId)
  Input: int ParentId
  Output: void
  Steps:
  1. if ParentId == S then
       2. if NumberOfReturns == NumberOfBranches then
            SearchForRemainingNodes()
          end
     else
       3. if NumberOfReturns != NumberOfBranches then
            4. if NumberOfReturns == NumberOfBranches then
                 ReturnToParent()
               end
          end
     end

The functioning of SEMRP-v1 is illustrated in Fig. 1, where the multicast tree initially contains only $U_S$ (Fig. 1(a)). The source $U_S$ starts constructing the tree by selecting the closest node, which is $U_B$. Thus $U_B$ is added to the tree, and $U_S$ adjusts its transmission power to reach $U_B$ and increases its number of branches. Next, $U_S$ calculates the incremental cost of $U_A$, which is $D_{(U_S,U_A)} - D_{(U_S,U_B)} = 1$, and the selected forwarder $U_B$ calculates the distance to $U_A$. Since $D_{(U_S,U_B)} + D_{(U_B,U_A)} > D_{(U_S,U_A)}$, $U_A$ is added to the tree with $U_S$ as its parent, so $U_S$ adjusts its transmission power to reach $U_A$ directly and increases its number of branches again. The process is repeated as shown in Fig. 1(b): $U_A \to U_F$, $U_A \to U_C$, $U_A \to U_G$, $U_C \to U_H$, $U_H \to U_D$, and $U_D \to U_R$.

After the leaf nodes $U_B$, $U_R$, $U_F$, and $U_G$ are reached, they begin returning to their parent nodes until $U_S$ is reached. For example, $U_A$ waits for $U_C$, $U_G$, and $U_F$ by comparing its number of branches to the number of received return messages, so that it can return to its parent node $U_S$. After all leaves have returned to the source $U_S$, $U_S$ broadcasts a message to search for nodes not included in the tree. The message is forwarded by tree members in the network until it reaches the targeted nodes. When the message reaches $U_E$, it selects the neighboring tree member closest to the source as its parent, $U_F \to U_E$, which is equal to 17 as shown in


[Figure 1: three panels showing the multicast tree over nodes US, UA, UB, UC, UD, UE, UF, UG, UH, UR and UV, with edges labeled by inter-node distance: (a) Step 1, (b) Step 8, (c) Step 10]

Fig. 1: An example of multicast tree construction
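
The parent-selection test at the core of SEMRP-v1 (step 7 of Algorithm 1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `choose_parent` and its string return values are hypothetical, and the example distances are the ones implied by the text for Fig. 1 (distance 5 from US to UB, 6 from US to UA, and 3 from UB to UA).

```python
def choose_parent(d_src_nearest: float, d_src_cand: float, d_nearest_cand: float) -> str:
    """Sketch of the step-7 test of Algorithm 1 for one candidate node u_j.

    d_src_nearest  : D(u_s, u_i), distance from the source to its nearest neighbor u_i
    d_src_cand     : D(u_s, u_j), distance from the source to the candidate u_j
    d_nearest_cand : D(u_i, u_j), distance from u_i to the candidate u_j

    Returns "source" when u_s should raise its power and adopt u_j directly,
    and "nearest" when u_i should become the parent of u_j instead.
    (Hypothetical helper; the labels are illustrative only.)
    """
    incremental_cost = d_src_cand - d_src_nearest
    if incremental_cost < d_nearest_cand:
        return "source"
    return "nearest"


if __name__ == "__main__":
    # Distances implied by the Fig. 1 walk-through: the incremental cost is
    # 6 - 5 = 1, which is below D(U_B, U_A) = 3, so U_S adopts U_A directly.
    print(choose_parent(5, 6, 3))
```

With these numbers the incremental cost for the source (1) is smaller than the cost for the forwarder (3), which matches the walk-through in which US reaches UA directly.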

Fig. 1(c). The next remaining node is $U_V$. Its closest tree member is $U_H$ ($U_H \to U_V$, with $D_{(U_V,U_H)} = 22$), therefore $U_H$ increases its transmission power to reach $U_V$ directly, as a function of its one-hop distance, which equals 10. Neighbors that are closer to $U_H$ than $U_V$, such as $U_R$ with a one-hop distance of 9, can be reached directly from $U_H$ after the transmission power is increased, therefore the new parent of $U_R$ is $U_H$, as shown in Fig. 1(c). This step has the advantage of minimizing the number of messages and the number of hops. After all nodes have been added to the multicast tree, the delivery of packets, which started in parallel with tree construction, continues until the multicast destinations are reached.

Procedure 2: SearchForRemainingNodes()
  Output: void
  Steps:
  1. if CurrentNode ∈ Tree T then
       SearchForRemainingNodes()
     else
       2. if CurrentNode ∉ Tree T then
            if CurrentNode is isolated then
              2.1a Add CurrentNode to the tree
              2.2a Make the sender node the parent of CurrentNode
              2.3a Adjust the transmission power of the parent node
              2.4a for each node i ∈ N(parent node P) do
                2.5a if D(i,P) < D(CurrentNode,P) then
                       Make node P the parent of node i
                     end
                   end
            else
              if CurrentNode is not isolated then
                2.1b Add CurrentNode to the tree
                2.2b Make the shortest-distance node the parent of CurrentNode
                2.3b Repeat steps 2.3a, 2.4a, 2.5a
              end
            end
          end
     end

C. SEMRP-v2 Protocol

This algorithm is an improved version of SEMRP-v1 that is based on two-hop discovery. In the following, we explain the main phases of the SEMRP-v2 algorithm.

Algorithm 2: Multicast Construction Tree of SEMRP-v2
  Input: $u_s$, $N_u$
  Output: MRT, the Multicast Routing Tree rooted at $u_s$
  Steps:
  1. $u_s$ discovers its one-hop neighbors;
  2. for each node $u_i \in N(u_s)$ do
       $D_{(u_s,u_i)} = \sqrt{(X_i - X_s)^2 + (Y_i - Y_s)^2 + (Z_i - Z_s)^2}$;
     end
  3. for each node $u_i \in N(u_s)$ do
       if $u_i$ is not notified yet then
         1. Notify($u_i$)  // to launch its one-hop discovery process
         2. Mark as notified;
       end
     end
  4. if all $u_i \in N(u_s)$ have finished their one-hop neighboring process then
       1. if myID == $u_s$ then
            1. $u_s$: insert myself in $T$;
            2. BuildTree(myID);
          else
            1. Set my status to ready;
            2. if I am marked as selected to be a forwarder then
                 BuildTree(myID);
               end
          end
     end
  5. if I receive a selection alert then
       BuildTree(myID);
     end
  6. if I receive a forwarder finished alert then
       if $N(u_s)$ != $N_T(u_s)$ then
         BuildTree(myID);
       end
     end

(i) In the first phase, the source $u_s$ broadcasts a hello message to discover its one-hop neighbors $N(u_s)$, which reply with their 3D positions to allow it to calculate the


distances. Then, the source notifies all its one-hop neighbors to launch their own one-hop discovery process.

(ii) Then, the source $u_s$ adds its nearest neighbor(s) to the tree and adjusts its transmission power to reach it.

(iii) In the third phase, $u_s$ should decide which new node will be added to the tree with the minimum power (i.e., minimum distance). Here, two alternatives are distinguished: the source $u_s$ should either increase its power to reach one or more other nodes (this includes an adjustment of its transmission power), or select a forwarder node $u_f$. The selection of the best forwarder is based on comparing the distances to the next closest neighbors of the nodes $N_T(u_i)$ inserted by the source, and choosing the minimum one. In this case, the source marks $u_f$ as its best forwarder and sends a selection alert to it. This third phase is repeated until all nodes are included in the tree.

After that, if a node $u_i$ receives a selection alert, it verifies whether its status is set to ready (it has completed its one-hop discovery and notified all its neighbors). If so, it executes the above three phases, in which it plays the role of the source. Otherwise, it marks itself as selected to be a forwarder. When all neighbors of a forwarder $u_f$ are in the tree, it sends an alert to its parent to allow it to re-execute the third phase. The formal description of this tree construction process is presented in Algorithm 2 and Function 1.

Function 1: BuildTree(int myID)
  Steps:
  1. $u_s$ = myID  // set the function caller as the current source
  2. Add the nearest neighbor $u_i$ of $u_s$ with $D_{(u_s,u_i)}$ to $T$
  3. $u_{nearest}$ = $u_i$
  4. Set $u_s$ as the parent of $u_{nearest}$
  5. Adjust the transmission power of $u_s$ to reach $u_{nearest}$
  6. while $N(u_s)$ != $N_T(u_s)$ do
       1. int decision = decide($u_s$)
       2. if decision < 0 then
            // means: increment power to reach the node $u_{decision}$
            1. int incremental = decision * (-1)
            2. Set $u_s$ as the parent of $u_{incremental}$
            3. Increment the transmission power of $u_s$ to reach $u_{incremental}$
            4. for each node $u_i \in N_T(u_s)$ do
                 if ($D_{(u_s,u_i)} < D_{(u_s,u_{incremental})}$) and (Hops($u_s$,$u_i$) < Hops($u_i$,parent($u_i$))) then
                   1. Update the parent of $u_i$ to $u_s$
                 end
               end
          else
            // means: select the node $u_{decision}$ as the forwarder of $u_s$
            1. $u_f$ = $u_{decision}$
            2. if $u_f$ is ready then
                 1. Send a selection alert to $u_f$
               else
                 1. Mark $u_f$ as selected as a forwarder
               end
          end
     end
  7. if $N(u_s)$ == $N_T(u_s)$ then
       if I am not the source node then
         1. Send a forwarder finished alert to my parent
       end
     end

Fig. 1(a) illustrates the functioning of SEMRP-v2. Let $U_S$ be the source drone. Initially, $U_S$ inserts itself as the root of the tree and adds its nearest neighbor, which is $U_B$, as shown in Fig. 1(a). Then, the nearest neighbor of $U_B$ is $U_A$, with matrix[B][A].distance = 3, and the next nearest neighbor of $U_S$ is $U_A$, with matrix[S][A] = 6. Because $U_S \to U_A$ through $U_B$ is 8 and $U_S \to U_A$ directly is 6, $U_S$ chooses to increment its power to reach $U_A$ directly. Then, $U_S$ selects $U_A$ as the next forwarder. At this point, when $U_A$ receives a selection alert, it finds itself ready to execute the BuildTree procedure. $U_A$ inserts its nearest neighbor, which is $U_F$. Then, $U_A$ decides to increment its power to reach $U_C$ and then $U_G$. After that, $U_A$ allows $U_C$ to be its next forwarder. The node $U_C$ repeats the same process as $U_S$ and $U_A$, as shown in Fig. 1(b): $U_C \to U_H$, $U_H \to U_D$, $U_D \to U_R$.

When the power of a node is increased, we verify whether a neighbor of that node that is already in the tree can be covered by it. If this neighbor can be covered and its hop count to the source drone $U_S$ can be reduced, we adjust the delivery path for this neighbor and change its parent to the current forwarder node. As shown in Fig. 1(c), when $U_H$ increases its power to reach $U_V$, it can reach $U_R$ with fewer hops to the source, so it becomes the parent of $U_R$ instead of $U_D$. The data delivery phase can be started in parallel with the tree construction process or after all nodes have been added to the resulting tree (Fig. 1(c)).

IV. PERFORMANCE EVALUATION

To evaluate the performance of our proposal, we implemented its different phases in the NS-2 simulator. In addition, we compare our algorithms with the state-of-the-art routing protocol SP-GMRF [8] in two different scenarios. The performance is recorded using the Nomadic Community mobility model while gradually increasing the drone density. The number of UAVs ranges from 10 to 40. The physical and MAC layer protocol used in this simulation is 802.11p. The remaining parameters of the two scenarios are described below:

1) In the first scenario, we compare our algorithms using the same simulation parameters as SP-GMRF [8]. The speed of the UAVs varies from 5 m/s to 20 m/s, the network size is 1000 m x 1000 m, the number of destination nodes is 5, the maximum transmission range of the UAVs is set to 250 m, and the simulation time is 100 seconds.
2) In the second scenario, we use a network of 2500 m x 2500 m, the number of destination nodes varies from 5 to 20, the speed of the UAVs varies from 10 m/s to 35 m/s, the maximum transmission range is set to 350 m, and the simulation time is 300 seconds. Simulation parameters are summarized in Table I.

The resulting metrics indicate whether a routing protocol performs better or worse in a specific scenario. The metrics are the following:


• Packets Delivery Ratio (PDR): the ratio of the number of packets successfully received at the destination nodes to the number of packets transmitted by the source nodes.
• Average End-to-End Delay: the average time from the sending of a packet at a source node until its delivery to a destination node.
• Total Emission Energy (in dBm): the global transmission energy in the network. We convert the total energy obtained from equation (3) from Watts to dBm as follows:

\[ 10 \log_{10} \left( \sum_{u_f \in T} \left[ \frac{P_{MinS}}{G_i G_j} \left( \frac{4\pi}{\lambda} \right)^{2} \cdot \max_{k \in N(u_f)} D^{2}_{(u_f,u_k)} \right] \right) + 30 \tag{4} \]

TABLE I: Simulation configurations

Parameter            Value: 1st scenario    Value: 2nd scenario
Mobility model       Nomadic community      Nomadic community
Number of UAVs       10, 20, 30, 40         10, 20, 30, 40
Velocity             [5-20] m/s             [10-35] m/s
Simulated area       1000 m x 1000 m        2500 m x 2500 m
Destination nodes    5                      5, 10, 15, 20
Transmission range   250 m                  350 m
Simulation time      100 seconds            300 seconds
MAC layer protocol   802.11p                802.11p
Packet size          128 bytes              128 bytes
$G_i$, $G_j$         1                      1
$P_{MinS}$           $10^{-9.8}$            $10^{-9.8}$
$\alpha$             2                      2
$\lambda$            $1/(5.9 \cdot 10^{9})$ $1/(5.9 \cdot 10^{9})$

[Figure 2 plot: Packets Delivery Ratio (%) versus number of UAVs (10 to 40) for SP-GMRF, SEMRP-v1 and SEMRP-v2 in the 1st and 2nd scenarios]

Fig. 2: Packets Delivery Ratio vs. UAVs density

[Figure 3 plot: End-to-End delay (ms) versus number of UAVs (10 to 40) for SP-GMRF, SEMRP-v1 and SEMRP-v2 in the 1st and 2nd scenarios]

Fig. 3: End-to-End delay vs. UAVs density

Figure 2 shows the PDR of SEMRP-v1, SEMRP-v2, and SP-GMRF as a function of UAV density. The curves show that SEMRP-v1 and SEMRP-v2 achieve a higher PDR than SP-GMRF for all network densities in both scenarios.

In the first scenario, both SEMRP-v1 and SEMRP-v2 reach 100% of the destination nodes at all network densities. The reason for this packet delivery rate is the absence of disconnections, which allows destination nodes to receive the packets successfully. Therefore, destination nodes that are tree members receive data packets while the tree construction phase is still in progress. In contrast, the packet delivery rate of SP-GMRF decreases to 75%. This can be explained by the increase in the time needed to construct the multicast tree when the UAV density increases. Furthermore, increasing the UAV density potentially leads to more one-hop neighbors within the communication range, which results in more computation to select the best next hop, since the selection relies on both the number of neighbors and the number of destinations. It also means that a child node frequently exits the communication range during the data delivery phase.

In the second scenario, we also observe that SEMRP-v1 and SEMRP-v2 have a packet delivery rate up to 22% larger than SP-GMRF. Both versions keep a high PDR, which decreases to 94.33% and 90% respectively, compared to SP-GMRF, whose PDR is reduced to 50%. The decrease in PDR in the second scenario can be explained by link breakage resulting from the high mobility of the nodes and by packet loss.

Figure 3 shows the End-to-End Delay of the transmitted data packets during the simulation time as a function of UAV density. From the curves, we observe that SEMRP-v1 and SEMRP-v2 significantly reduce the average delay in comparison with SP-GMRF in both scenarios.

In the first scenario, both SEMRP-v1 and SEMRP-v2 converge to a reduced End-to-End Delay (lower by at least 59 ms). The reason for this is the route selection, which reduces the number of hops in the tree. In SP-GMRF, by contrast, we observe a continuous increase in the End-to-End Delay, which results from the increase in the number of hops and the computation of links in the tree construction phase.

We also observe in the second scenario that SEMRP-v1 and SEMRP-v2 significantly reduce the End-to-End Delay (by at least 39 ms) compared to SP-GMRF. It is also worth noting that the End-to-End Delay increases with increasing velocity and destination node density. This is mainly due to the increase in the number of hops in SEMRP-v1 and SEMRP-v2, in addition to the high mobility of the nodes, which causes packet re-transmissions. The increase in SP-GMRF, in turn, results from not being able to find the shortest path to the destination during the tree construction phase. In combination with node mobility, this causes packet loss in the packet delivery phase and re-transmissions.

[Figure 4 plot: Total Emission Energy (dBm) versus number of UAVs (10 to 40) for SP-GMRF, SEMRP-v1 and SEMRP-v2 in the 1st and 2nd scenarios]

Fig. 4: Total Emission Energy vs. UAVs density

Figure 4 shows the Total Emission Energy for delivering data packets as a function of UAV density. The SEMRP-v1 and SEMRP-v2 protocols yield a lower Total Emission Energy in the network in both scenarios (by at least 10 dBm) compared to SP-GMRF. Our solution achieves better results for delivering multicast messages in a swarm of UAVs because of our packet forwarding strategy and the efficient adjustment of the transmission power of the forwarders, which also reduces the number of hops. The Total Emission Energy is reduced accordingly, whereas in SP-GMRF the resulting Total Emission Energy can be explained by the choice of longer routes and the increased number of hops.

V. CONCLUSION

In this paper, we studied the use of multicast routing in swarms of UAVs and proposed a reliable multicast routing protocol that uses minimum energy. Simulation results demonstrated that our solution outperforms existing solutions such as SP-GMRF in terms of average delay and total emission energy. The results also confirmed that our proposal is able to provide more reliable and scalable data delivery for all swarm members. As future work, we plan to investigate the use of cooperative UAV swarms interacting with both the ground and the edge. In the first case, the UAVs of the swarm operate as mobile service providers to ground users, thus enabling applications such as navigation support information in intelligent transportation systems. In the second case, we plan to explore the data offloading problem for UAV swarms at the edge level.

REFERENCES

[1] D. Shumeye Lakew, U. Sa’ad, N. Dao, W. Na, and S. Cho, “Routing in flying ad hoc networks: A comprehensive survey,” IEEE Communications Surveys Tutorials, vol. 22, no. 2, pp. 1071–1120, 2020.
[2] H. Zhao, H. Wang, W. Wu, and J. Wei, “Deployment algorithms for uav airborne networks toward on-demand coverage,” IEEE Journal on Selected Areas in Communications, vol. 36, no. 9, pp. 2015–2031, 2018.
[3] Y. Yu, L. Ru, W. Chi, Y. Liu, Q. Yu, and K. Fang, “Ant colony optimization based polymorphism-aware routing algorithm for ad hoc uav network,” Multimedia Tools and Applications, vol. 75, pp. 1–26, 2016.
[4] V. A. Maistrenko, L. V. Alexey, and V. A. Danil, “Experimental estimate of using the ant colony optimization algorithm to solve the routing problem in fanet,” in 2016 International Siberian Conference on Control and Communications (SIBCON), 2016, pp. 1–10.
[5] O. S. Oubbati, M. Atiquzzaman, P. Lorenz, M. H. Tareque, and M. S. Hossain, “Routing in flying ad hoc networks: Survey, constraints, and future challenge perspectives,” IEEE Access, vol. 7, pp. 81057–81105, 2019.
[6] T.-S. Pan, D. Kien, J.-S. Pan, and T.-T. Nguyen, An Unmanned Aerial Vehicle Optimal Route Planning Based on Compact Artificial Bee Colony, 11 2017, pp. 361–369.
[7] F. Z. Bousbaa, C. A. Kerrache, Z. Mahi, A. E. K. Tahari, N. Lagraa, and M. B. Yagoubi, “Geouavs: A new geocast routing protocol for fleet of uavs,” Computer Communications, vol. 149, pp. 259–269, 2020.
[8] H. Redwan, S. Choi, J.-H. Park, and J. Kim, “Predictive geographic multicast routing protocol in flying ad hoc networks,” International Journal of Distributed Sensor Networks, vol. 15, p. 155014771984387, 07 2019.
[9] V. Sharma, R. Kumar, and N. Kumar, “Dptr: Distributed priority tree-based routing protocol for fanets,” Computer Communications, vol. 122, pp. 129–151, 2018.
[10] S. Fidanova and K. Atanassov, “Flying ant colony optimization algorithm for combinatorial optimization,” Studia Informatica, vol. 38, no. 4, pp. 31–40, 2017.
[11] A. V. Leonov, “Application of bee colony algorithm for fanet routing,” in 2016 17th International Conference of Young Specialists on Micro/Nanotechnologies and Electron Devices (EDM), June 2016, pp. 124–132.
[12] D. Karaboga, “An idea based on honey bee swarm for numerical optimization, technical report - tr06,” Technical Report, Erciyes University, 01 2005.
[13] D. Pham, A. Ghanbarzadeh, E. Koç, S. Otri, S. Rahim, and M. Zaidi, “The bees algorithm - a novel tool for complex optimisation problems,” Proceedings of IPROMS 2006 Conference, 12 2006.
[14] D. J. Akbari Torkestani and M. Meybodi, “Multicast routing protocols in manet,” Technological Advancements and Applications in Mobile Ad-Hoc Networks: Research Trends, 01 2012.
[15] F. Z. Bousbaa, N. Lagraa, C. A. Kerrache, F. Zhou, M. B. Yagoubi, and R. Hussain, “A distributed time-limited multicast algorithm for vanets using incremental power strategy,” Computer Networks, vol. 145, pp. 141–155, 2018.
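
As a numerical illustration of the power model of Section III-A, the sketch below evaluates the minimum transmission power of Eq. (2) and the Watt-to-dBm conversion used in Eq. (4). It is a hedged example under stated assumptions, not the simulation code behind the reported results: all function names are hypothetical, the unit gains and the sensitivity of 10^-9.8 W follow Table I, and the wavelength is computed as c/f for the 5.9 GHz band, which is an assumption of this sketch (Table I lists its own value for the wavelength parameter).

```python
import math

SPEED_OF_LIGHT = 299_792_458.0        # m/s
P_MIN_S_W = 10 ** -9.8                # minimum sensitivity in Watts (-68 dBm)
WAVELENGTH_M = SPEED_OF_LIGHT / 5.9e9  # assumption: physical wavelength at 5.9 GHz


def min_tx_power_w(distance_m, gain_tx=1.0, gain_rx=1.0,
                   p_min_w=P_MIN_S_W, wavelength_m=WAVELENGTH_M):
    """Eq. (2) with alpha = 2: smallest power (W) that still delivers
    p_min_w to a receiver located distance_m away."""
    path_term = (4.0 * math.pi * distance_m / wavelength_m) ** 2
    return p_min_w / (gain_tx * gain_rx) * path_term


def watts_to_dbm(power_w):
    """Convert a power in Watts to dBm (10*log10(P) + 30)."""
    return 10.0 * math.log10(power_w) + 30.0


def total_emission_dbm(farthest_child_m):
    """Eq. (4): sum, over all forwarders, the power needed to reach each
    forwarder's farthest child, then express the total in dBm.
    farthest_child_m holds one distance (in meters) per forwarder."""
    total_w = sum(min_tx_power_w(d) for d in farthest_child_m)
    return watts_to_dbm(total_w)


if __name__ == "__main__":
    print(round(watts_to_dbm(min_tx_power_w(250.0)), 2))
    print(round(total_emission_dbm([250.0, 250.0]), 2))
```

Under these assumptions a single 250 m link needs about 27.8 dBm, that is, the -68 dBm sensitivity plus roughly 95.8 dB of free-space path loss, and two such forwarders add about 3 dB to the total.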

[Figure plot: number of operations (logarithmic axis) versus number of BSs (1 to 20); series: A priori Sums/Subs, A priori Mults/Divs, ISAAC Sums/Subs, ISAAC Mults/Divs]


[Figure plots: acquired data [GB] and uploaded data [GB] versus number of BSs (1 to 7)]
196
2020 IEEE/ACM 24 ͭ ͪ International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

[Figure plots: acquired data [GB] and uploaded data [GB] versus number of BSs (1 to 7)]


Simulation and Digital Twin Support for Managed Drone Applications

Nasos Grigoropoulos and Spyros Lalis
University of Thessaly
Volos, Greece
Email: {athgrigo,lalis}@uth.gr

Abstract: As drone technology passes one milestone after the other, drones are used in an ever-increasing number of applications and are now considered an integral part of the future smart city infrastructure. At the same time, the inherent safety and privacy risks associated with drone-based applications call for appropriate testing and monitoring tools. In this paper, we present a simulation environment and digital twin support for a platform that allows the managed execution of drone-based applications on top of a shared drone infrastructure. On the one hand, the simulation environment makes it possible to perform a wide range of tests regarding the operation of both the platform itself and the applications that run on top of it, before deploying them in the real world. On the other hand, after deployment, a digital twin of the drone is used to detect deviations of the application from the expected behavior, which, in turn, can serve as an indication of bugs that remained undetected during the simulation tests or malfunctions that occur at runtime. We discuss the most important elements of our approach and the simulation and digital twin components of the proposed system. Also, we provide a functional evaluation of our work by presenting its capabilities regarding both offline testing and runtime checking through indicative use cases.

Keywords: drones, simulation environment, digital twin, PaaS

I. INTRODUCTION

Drones are evolving into a major component of the next-generation smart city initiatives, which aim at enhancing the life of their residents through efficient infrastructures and services [1]. For example, delivery drones expedite the transportation of goods and medical products with minimal human involvement while avoiding traffic-related delays. Furthermore, drones are a valuable asset for private and public property surveillance and security operations, offering better coverage and rapid response to critical situations. The construction industry has also started using drones for the aerial inspection of buildings and infrastructure in order to improve worker safety and increase the efficiency of scheduling operations.

Even though the prices of commercial off-the-shelf drones are steadily decreasing, the cost of specialized drones built with durable materials or carrying high-end equipment, combined with licensing and insurance fees, remains significant. As a consequence, such drones typically remain out of reach for most small and medium-sized companies. An alternative is to use drones and drone-related resources on demand, in the spirit of the Platform-as-a-Service (PaaS) paradigm, which is already very popular in Cloud computing. While such efforts are still in their infancy, this approach has significant advantages compared to private drone ownership and operation.

However, the upcoming coexistence of multiple drones flying over citizens and private properties raises several safety and privacy issues due to possible crashes, collisions, and uncontrolled usage of the drones' onboard equipment [2], [3]. While most countries have formal processes for submitting flight plans and getting approval, in many cases there are no mechanisms for monitoring drone operation and, more crucially, for ensuring that the approved flight plan is followed. Also, most efforts towards low altitude airspace management [4], [5] focus on where drones are allowed to fly, rather than on how their onboard equipment is used during flight.

Inevitably, this leads to skepticism and limited public acceptance of drone-based systems, and even to extreme reactions by people opposed to drone usage [6]. To gain citizens' trust, systems have to be engineered to address safety and privacy issues by design, through suitable mechanisms that can be easily integrated and used to detect bugs and malfunctions.

In this paper, we propose a holistic approach towards supporting a more reliable managed operation of third-party applications on a shared drone infrastructure, which can contribute to building trust between the various stakeholders and making drones more acceptable to the wider public. The main contributions of the paper are: (i) we present a modular architecture combining a PaaS system for drone applications, which offers automated deployment and restriction enforcement, with corresponding simulation and digital twin support that can be used to detect bugs before deployment and to indicate possible malfunctions during operation in the real world, respectively; (ii) we discuss key aspects of a prototype implementation; (iii) we showcase how the proposed work can be used in practice through representative case studies.

The rest of the paper is organized as follows. Section II describes the PaaS system that constitutes the baseline for this work. Section III presents the design of the simulation environment and digital twin support we propose for this PaaS system, while Section IV discusses the main aspects of our implementation. Section V illustrates the simulation and digital
twin functionality for an indicative test application under
different execution scenarios. Section VI gives an overview
978-1-7281-7343-6/20/$31.00 ©2020 IEEE of related work. Finally, Section VII concludes the paper.

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

II. PAAS FOR DRONE APPLICATIONS

Our high-level goal is to enable a safer integration of drone applications in the urban
environment, through suitable system support. This section briefly introduces a
Platform-as-a-Service (PaaS) system for drone-based applications, which is the baseline
for the work presented in this paper.

The PaaS paradigm has recently been investigated as an attractive option beyond the
Cloud, also for drone-based applications [7], [8], [9]. In this spirit, we have proposed
and developed a PaaS-like system, which supports the managed deployment and execution of
applications on top of a shared drone infrastructure [10]. Our platform has two
distinctive features. First, it allows application code to run directly on the drone,
along the lines of Fog/Edge computing [11]. This makes it possible to implement
application-level sensing-processing-actuation loops with minimal latency and avoids
sending large amounts of raw data to the Cloud. Second, the platform enforces
restrictions that are defined by authorities for the application’s mission/flight plan,
even if the application that runs on the drone attempts to violate them at runtime.

Figure 1 illustrates the high-level architecture of the platform. Briefly, drone
applications, developed by qualified and certified third parties, are registered with the
system so that they can be invoked by different clients. Each application is an
independently deployable and runnable software component, which is stored in the system’s
application repository. In turn, the drones that are part of the infrastructure serve as
application hosts. The sensor/actuator, computing/communication and flight-related
requirements of the application, and the available drone resources and flight
capabilities, are captured via corresponding descriptors. Further, the different flight-
and privacy-related restrictions posed by the authorities are also captured in an
explicit way through suitable descriptors.

Fig. 1: Architecture of PaaS for managed deployment and controlled execution of
drone-based applications on top of a shared drone infrastructure.

The Management Controller (Controller for short) resides in the Cloud and provides
services for managing all client requests (e.g., for starting the execution of an
application), allocating drones to the applications that need to be executed, generating
a corresponding flight plan based on the current restrictions, and deploying the
application and flight plan on the drone. Every drone features a trusted Drone Runtime
environment, which is used to execute the application while providing the necessary
isolation for critical drone resources and subsystems, including the autopilot. Also, the
Drone Runtime enforces the restrictions associated with the running application and
current flight plan, via mechanisms that detect and handle application-level violations.
During application execution, the Drone Runtime sends periodic status updates to the
Controller. Conversely, the Controller may issue commands and updates to the Drone
Runtime for the restrictions that apply to the specific application and flight plan.

III. SIMULATION AND DIGITAL TWIN SUPPORT

The platform described in Section II is an important step towards our vision. However,
additional support is required so that developers can test both the platform and the
application before deployment. Furthermore, it is important to detect unexpected
deviations and potential malfunctions of the drone at runtime. To this end, we have
developed a complete simulation environment, which makes it possible to perform a wide
range of tests regarding the operation of both the platform itself and the applications
that run on top of it. Furthermore, based on this simulation environment, we have
developed digital twin support for the drone, which makes it possible to detect, at
runtime, large deviations of the application execution from the expected behavior. The
following subsections describe our approach in more detail.

A. Overview

A high-level view of our framework is shown in Figure 2. The basic simulation entity is
the so-called virtual drone (v-drone). It represents a simulated drone that can run the
complete software stack of a real drone, including the autopilot and the Drone Runtime of
the PaaS platform. The v-drone can be configured to work in two different modes: pure
simulation mode and digital twin mode. Also, there is a virtual Controller entity
(v-Controller), which is a mockup implementation of the real Controller of the PaaS
platform, through which it is possible to issue commands to and receive status updates
from v-drones. Simulation scenarios may involve multiple v-drones, which can communicate
with the v-Controller and/or with each other through one or more simulated wireless
channels. Depending on the mode of operation, virtual entities (v-drone and v-Controller)
may co-exist and run in tandem with real system entities (drone and Controller).

Real drones, v-drones and the v-Controller all feature an agent component. The agent
exposes an interface that is used by the Test Orchestrator in order to properly configure
the participating entities according to the needs of the specific test


mode and objectives. Also, the agent is responsible for creating logs where different
runtime events are recorded. These logs are sent to other agents or they are stored in a
central repository for further processing by the Results Analyzer in order to confirm
expected behavior or detect deviations.

Fig. 2: High-level view of the framework providing simulation and digital twin support
for the PaaS platform.

B. HITL and SITL Configurations of v-drones

Platform and application testing can be performed at low cost and with zero risk using a
simulated setup that consists of a v-Controller and one or more v-drones. In this case,
the v-drone can be set to operate in a hardware-in-the-loop (HITL) or
software-in-the-loop (SITL) configuration. Figure 3 illustrates the most typical
simulation options (for brevity, the agent components are not shown).

Fig. 3: Simulated setup with HITL and SITL configurations.

In the HITL configuration, the Drone Runtime environment and the application both run on
the hardware platform that is actually used in the real drone. The autopilot may also run
on this hardware if it supports a HITL mode. Otherwise, it runs as a separate simulation
entity (v-Autopilot) on the computing infrastructure of the testing facility, in which
case it interacts with the rest of the drone software stack running on the target drone
hardware through a proxy. Finally, it is possible to have a pure SITL configuration where
the entire drone software stack (autopilot, Drone Runtime, and application) runs on
virtual hardware on the computing infrastructure.

The autopilot communicates with a simulator that implements the flight dynamics model for
the drone. It provides artificial sensor data for the accelerometers, gyroscopes,
magnetometers, and barometers, which normally come from the drone’s inertial measurement
units (IMUs) during real operation. This sensor data is used by the autopilot to estimate
the position, speed, and acceleration of the drone, based on which control decisions are
taken and corresponding commands are issued to the drone’s actuators (e.g., motors). In
the HITL configuration, the flight dynamics simulator runs as a separate entity, which is
connected over serial or a fast network connection to the hardware where the drone
software stack is running. In the SITL configuration, there is also the option of running
the flight dynamics simulator together with the autopilot as a single simulation entity.
The simulation framework is flexibly configurable, allowing the user to choose the setup
that is most appropriate for the specific testing objectives.

C. Offline Platform Testing

Using these simulation configurations, it is possible to test different aspects of the
platform functionality. For this purpose, one typically designs suitable test
applications. On the one hand, the resource allocation, application deployment and
application execution functionality can be tested using “benign” applications that
implement different types of missions (the PaaS supports specific templates). On the
other hand, “malicious” applications, which intentionally attempt to perform actions that
violate the flight plan and related restrictions, can be used to confirm proper operation
of the platform’s monitoring and violation detection/handling mechanisms.

It is also possible to run different scenarios aimed at testing exceptional and critical
situations. This can be achieved by instructing the v-Controller to issue commands to the
drone, which change the flight plan, update restrictions, or even request an emergency
landing, and then check that the platform behaves accordingly. Furthermore, one can
experiment with intermittent connectivity and disconnected operation scenarios, by
disrupting the (simulated) wireless communication between the v-drone and the
v-Controller and observing whether the corresponding fail-safe actions are triggered.

D. Offline Application Testing

Once the platform has been tested to a satisfactory degree, the same simulation
environment can be used to test and debug concrete applications in a practically
open-ended fashion.

The usual objective of application testing is to verify that the application always
behaves as expected. This can be achieved by designing and running a wide range of test
scenarios, which combine different application configuration parameters, flight plans and
restrictions. After each test run, the outcome that is recorded in the event logs can be
compared to the expected application behavior.
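As a rough illustration of comparing the outcome recorded in the event logs with the expected application behavior, the following Python sketch locates the first point at which a recorded event sequence departs from the expected one. It is a minimal sketch under stated assumptions, not the Results Analyzer of the paper: the `Event` record layout, its field names, and the `first_deviation` helper are hypothetical and were introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One recorded runtime event (hypothetical log record layout)."""
    source: str       # e.g. "app", "runtime", "autopilot"
    kind: str         # e.g. "arm", "takeoff", "goto", "land", "violation"
    detail: str = ""  # e.g. a waypoint name or a violation type


def first_deviation(recorded, expected):
    """Return the index of the first mismatch, or None if both logs agree."""
    for i, (got, want) in enumerate(zip(recorded, expected)):
        if got != want:
            return i
    if len(recorded) != len(expected):
        # One log is a strict prefix of the other.
        return min(len(recorded), len(expected))
    return None


# Expected behavior of a simple waypoint mission (illustrative values).
expected = [
    Event("app", "arm"),
    Event("app", "takeoff", "WP1"),
    Event("app", "goto", "WP2"),
    Event("app", "goto", "WP3"),
    Event("app", "goto", "WP1"),
    Event("app", "land"),
]

# A run in which the runtime aborted the mission after WP2.
recorded = expected[:3] + [
    Event("runtime", "violation", "flight_path"),
    Event("runtime", "land"),
]

print(first_deviation(expected, expected))  # None
print(first_deviation(recorded, expected))  # 3
```

Reporting the index of the first mismatch, rather than a plain pass/fail flag, points the tester at the exact event where the run departed from the expected behavior.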


Another typical objective of application testing is to confirm that the application does
not consume more computing and communication resources than declared by the developer (as
specified in the corresponding description). This can be achieved by properly configuring
the resources that are allocated to the application on the v-drone as well as by
observing the actual resource consumption during execution.

For such tests it typically suffices to use a pure SITL configuration of the v-drone. Of
course, it is also possible to use a HITL configuration if it is desired to run the
application (and the rest of the drone software stack) on the target hardware platform
used in the real drone. As discussed, the flight dynamics simulator can be integrated
with the autopilot or run separately, depending on the degree of fidelity required.

E. Digital Twin Setup for Application Checking at Runtime

Despite elaborate offline testing, hidden hardware malfunctions or silent data corruption
may still occur at runtime. It is thus highly desirable to be able to confirm that
everything runs as expected during the real mission, ideally without involving a human
observer.

This is achieved using a digital twin (DT) approach, a popular way of testing
cyber-physical systems. More specifically, next to the drone that runs the application in
the real world, the Drone Runtime and application also run on a v-drone. The v-drone is
configured to operate in a special DT/SITL configuration, where the autopilot is replaced
by a mockup and where the execution of the application and Drone Runtime occurs in
conjunction with a replay engine that receives information from the real drone. Figure 4
shows a high-level illustration of the approach.

Fig. 4: DT/SITL v-drone configuration.

In the real drone, the agent component continuously records state-related information and
streams it to the v-drone acting as its digital twin. This information includes the
application startup event, the application requests that are received by the Drone
Runtime (e.g., arm, takeoff, navigation or camera control commands), the corresponding
(potentially adjusted) requests forwarded by the Drone Runtime to the autopilot, the
Controller commands towards the Drone Runtime, as well as the telemetry logs that contain
the autopilot’s replies to received requests along with status and flight-related data
published by the autopilot periodically. Each piece of information is timestamped using
the drone’s system clock. In addition, the requests of the application and the
corresponding replies of the autopilot are tagged with increasing sequence numbers.

In the v-drone, the replay engine uses this information to drive the local execution.
More specifically, the v-drone starts application execution with a pre-specified delay
defined by the user (operator). The startup delay must be large enough to ensure a
continuous flow of information from the real drone to the v-drone assuming a worst-case
data transfer delay and jitter over the wireless connection, but also short enough to
enable timely identification of execution inconsistencies. A startup delay in the order
of a few seconds is typically sufficient for this purpose. Once application execution
starts, the replay engine continuously checks the received information for the
chronologically next event, with a timestamp that is equal to the local clock value minus
the startup delay. Autopilot status messages are sent through the mockup, whereas
Controller commands are sent to the Drone Runtime.

The agent component of the digital twin intercepts application requests, retrieves their
type and passed parameters, and compares them with the ones that were issued by the
application running on the real drone. Obviously, requests with the same sequence number
should be identical. If so, the agent forwards the application request to the Drone
Runtime as usual, which, in turn, issues a request to the autopilot mockup that replies
using the corresponding information supplied by the replay engine. Else, if the
application request on the digital twin differs from the one that was generated on the
real drone, which may be an indication of a possible malfunction, an alert is raised and
the digital twin terminates.

Note that this setup also makes the real application state easily accessible to testers,
for instance, to inspect several performance indicators in depth when exploring
optimizations. This can be done efficiently by accessing the internals of the digital
twin in a direct way, instead of having to access the real drone over the wireless/mobile
network, which can introduce a large delay but also consume valuable resources that
should be reserved for more critical control operations.

IV. IMPLEMENTATION

The design of the simulation framework is based on AeroLoop [12], a modular system for
experimentation with virtual drones designed to run on off-the-shelf computing
infrastructure. The v-Controller and the v-drone in the SITL and DT/SITL configurations
are packaged as separate virtualized systems. Instead of virtual machines (VMs) used in
the vanilla AeroLoop system, in this case, we use Linux Containers (LXDs). LXDs offer a
good compromise between isolation and resource efficiency, as they provide operating
system-level virtualization, which is more lightweight than full-fledged VMs, while
offering a more complete virtual system (closer to a real drone) than Docker containers
that share the same


networking/storage stacks. For the HITL configuration of the v-drone, we support the
Raspberry Pi 3 platform, which is the most popular companion board used in real drones.

The Test Orchestrator and Results Analyzer are implemented as Python libraries. The test
scenarios are Python scripts that simply invoke these libraries. The agents are also
implemented as Python programs, running as standalone processes with the respective
interface being invoked via RPCs through the zerorpc library [13]. For logging, we
utilize the standard Python logging facility, and event logs are processed by the Results
Analyzer using standard Unix tools.

The Test Orchestrator creates and configures all simulation entities through the
respective agents. It is also responsible for performing all the network configuration to
enable the communication between the entities involved in a given test. In
simulation-based configurations, wireless networking is implemented using ns-3 [14]. For
Wi-Fi channels, each simulated ns-3 node, called ghost node, utilizes the ns-3 TapBridge
device, which is connected to each v-drone through a combination of network bridges and
virtual network devices (see [12]). We also provide support for simulated LTE interfaces.
This is achieved by introducing a high-bandwidth, low-latency CSMA link between a ghost
node and the simulated LTE’s UE node (see [15]). Finally, the v-drone agents continuously
update the position of the respective ns-3 ghost nodes through a pub/sub scheme that is
implemented using the ZeroMQ library [16]. In the DT/SITL configuration, the Test
Orchestrator instructs the agents of the real drone and the v-drone to set up the
physical connection that will be used to transfer the required information from the real
drone to its digital twin.

Inside v-drones, the application is executed as a Docker container (this is how
applications are packaged in the PaaS system). To this end, for v-drones running as LXDs
we exploit the nested container functionality. Applications may perform different
navigation operations, access sensors and issue actuation commands via the autopilot
subsystem, using the MAVLink messaging protocol [17]. Various APIs offer MAVLink support,
from low-level C and Python (pymavlink) libraries to higher-level ones like DroneKit [18]
and ROS [19] through the MAVROS communication node.

The autopilot used in v-drones is the latest stable version of ArduPilot [20], which is
one of the most widely adopted autopilot stacks supporting a wide variety of aerial
vehicles. In the HITL and SITL configurations, we use the pre-built binaries for the ARM
and x86_64 architectures, respectively. The autopilot proxy in the hybrid HITL/SITL
v-drone configuration is implemented using the MAVProxy [21] multiplexing tool, which
forwards all messages of the Drone Runtime to the remote autopilot (running in SITL), and
vice versa. Note that any autopilot that supports MAVLink and provides support for HITL
or SITL simulation, such as PX4, could be easily integrated in our framework. The replay
engine used in the DT/SITL configuration is implemented as a Python module, while the
autopilot mockup utilizes the pymavlink library for receiving and sending MAVLink
messages.

As mentioned in Section III, the autopilot of a v-drone can be configured to use an
integrated flight dynamics model or be connected with an external, higher-fidelity
simulator. ArduPilot has built-in support for the former option, while for the latter
option we currently support Gazebo [22] through the corresponding plugin. Note that any
other simulator that supports ArduPilot (e.g., AirSim [23]) could be used instead.

Finally, the simulation environment provides two options for simulated camera support to
the application. When running the autopilot together with Gazebo, the ROS camera plugin
can be utilized to publish the virtual camera stream of Gazebo to a ROS topic.
Alternatively, we provide a custom virtual camera module for taking snapshots, which
mimics the API of the Raspberry Pi camera module and returns images automatically
extracted from a pre-configured database (see [12]).

V. FUNCTIONAL EVALUATION

We illustrate the main aspects of the provided functionality through indicative scenarios
focusing on flight-related behavior. The application we use for this purpose is
intentionally kept simple. It is a Python program that arms the drone, takes off to WP1
(at a height of 10 meters), goes to waypoint WP2, next moves to waypoint WP3, returns to
initial location WP1, and lands. This is done by issuing corresponding commands to the
autopilot via DroneKit.

A. Offline Validation

Listing 1 shows the setup sequence for an offline test using a v-drone and a v-Controller
that communicate via Wi-Fi.

 1 controller = vController()
 2 drone = vDrone("vDrone-1", SITL,
 3                autopilot=Ardupilot,
 4                setup=config.min_resources)
 5 # drone = vDrone("vDrone-1", SITL, fdm=gazebo)
 6 drone.set_app("waypoint_app",
 7               params=config.app_params)
 8 drone.set_plan(config.flight_plan)
 9 drone.set_pos(config.home_pos)
10 network = NetSim(config.net_wifi)
11 network.add_participants(controller, drone)
12 drone.set_logs(app, runtime, autopilot)
13 controller.set_logs(runtime)
14 drone.start_app()

Listing 1: Setup and start offline test.

In a nutshell, the steps are to: create a v-Controller entity (line 1); instantiate a
v-drone with identifier “vDrone-1”, which is set in SITL mode using ArduPilot, with
system resources (in memory and disk) as indicated in a configuration file (lines 2-4);
set the waypoint application as a pre-installed application at the v-drone (as opposed to
deploying it dynamically via the v-Controller) and specify its parameters (takeoff
altitude and waypoints) (lines 6-7); set the approved flight plan, which also specifies
restrictions and the respective corrective actions (line 8); set the start location /
home position of the v-drone (line 9); create a Wi-Fi network (lines 10-11); activate
logging at all available levels (lines 12-13); and finally start the application
(line 14).

The option of specifying the usage of an external flight dynamics simulator in the SITL
setup (in this case, Gazebo)


is shown in line 5, in the form of a comment. In the specific scenario, wireless
communication is used only for sending heartbeats from the v-Controller to the v-drone,
and periodic status messages in the opposite direction.

Figure 5 depicts several aspects of the testing scenario. The white line connects the
waypoints as specified in the flight plan, while the light green triangle indicates the
(wider) area (generated by the PaaS platform) where the drone is allowed to fly. If the
application tries to cross this border by issuing a navigation command, or the drone
moves out of these bounds due to external factors (e.g., weather conditions), a violation
occurs, in which case the Drone Runtime should apply the respective corrective action,
namely to stop application execution and instruct an immediate landing. The dark blue
line shows a test run where the application successfully completes the mission and
behaves according to the flight plan.

Fig. 5: Offline testing experiment.

To test that the Drone Runtime can indeed detect the specific violations and perform the
required actions, we run a test scenario where, after reaching WP2, a strong wind gust
hits the v-drone for a period of 10 seconds (Listing 2).

1 if drone.get_pos() == config.WP2:
2     drone.set_wind(wdir=315, speed=20, timeout=10)

Listing 2: Setup simulated wind.

The corresponding flight trace is depicted in Figure 5 in red. As can be seen, the line
stops shortly after the drone crosses the green boundary. This is because the Drone
Runtime indeed detects the violation and immediately performs an emergency landing, as
required.

Apart from this kind of functional evaluation, there are also other checks that can be
performed during simulation execution or after its completion. Indicative examples are
given in Listing 3. These checks can have the form of queries or assertions. In the first
case, specific data are returned, such as information regarding application violations,
the size of data sent, or the average system resources used by the application during
execution (lines 1-3), in order to, e.g., compare them by hand with the expected outcome
in case of successful completion (line 4). In the latter case, checks directly examine
the application status or the existence of a violation of a specific type (lines 5-6).

1 info = get_violations("vDrone-1")
2 data = get_data_sent("vDrone-1", level=app)
3 perf = get_app_resources("vDrone-1")
4 compare(perf, config.expected_perf)
5 check_app_status("vDrone-1", SUCCESS)
6 check_violations("vDrone-1", type='flight_path')

Listing 3: Analyze other test results.

B. Runtime Checking

For the runtime checking using the digital twin configuration, we use a custom-made
hexacopter, shown in Figure 6. Besides the autopilot board with the basic flight-related
sensors and actuators, the drone features a Raspberry Pi 3 onboard computer with the
Drone Runtime environment of the PaaS platform, which is used for the application
deployment and execution. The onboard computer is also used to run the agent of the
testing framework.

Fig. 6: Hexacopter used for the runtime checking.

Listing 4 shows the script for setting up such a test. In a first step, an object
representing the real drone is created and its configuration, i.e., application,
parameters, approved flight plan, is retrieved (lines 1-2). Then, a virtual drone is
configured in digital twin mode and is initialized using the application/flight plan
configuration of the real drone (lines 3-6). Finally, the two entities are connected with
each other in order to enable the streaming of all the state-related data from the real
drone to the digital twin (line 7).

1 real_drone = realDrone("drone-1")
2 test_config = real_drone.get_config(app, params, plan)
3 vdrone = vDrone("vDrone-1", DT)
4 vdrone.set_app(test_config.app,
5                params=test_config.params)
6 vdrone.set_plan(test_config.plan)
7 connect_drone_with_dt("drone-1", "vDrone-1")

Listing 4: Setup and start online test.

To demonstrate the detection of unexpected deviations at runtime, we modify the
application that runs on the real drone so that it executes a slightly different mission.
More specifically, after reaching WP2, it issues a request to navigate towards waypoint
WP-R3, as shown in the dark green path depicted in Figure 7. In contrast, the application
running


(correctly) on the digital twin requests a movement to WP3 (as it should), which
corresponds to the white line. This deviation is detected and raises an alarm. The
digital twin terminates at this point, and it is up to the operator to decide which
action to take concerning the real drone.

Fig. 7: Runtime checking experiment.

The API for comparing the behavior of the application running on the real drone vs. on
its digital twin is straightforward, as shown in Listing 5.

1 vdrone.compare_app_requests()
2 vdrone.list_app_requests(source=real)
3 vdrone.list_app_requests(source=dt)
4 vdrone.dt_exec("docker stats --no-stream")
5 vdrone.app_exec("top -b -d10 -n2 >top_results.txt")

Listing 5: Check application behavior at runtime.

During execution one may continuously compare the requests issued by the application on
the digital twin with those issued by the application on the real drone, or simply list
these requests in an explicit way (lines 1-3). Also, as long as the execution of the
digital twin is identical to that of the real drone, one can inspect several application
performance aspects simply by querying the digital twin, e.g., by executing commands to
check the application’s container state, or even the detailed resource usage of specific
processes (lines 4-5).

VI. RELATED WORK

Drone simulators & frameworks. A well-known method for testing drone-based applications
before deployment is through software-in-the-loop (SITL) and hardware-in-the-loop (HITL)
simulation environments. Gazebo [22] is a feature-rich 3D robotic simulation platform
that can be combined with a variety of physics engines and sensor models, and through
suitable plugins can be used for SITL/HITL testing. AirSim [23] is a more recent platform
that offers physically and visually realistic simulations through the usage of the Unreal
Engine, and is focused on enabling developers of autonomous systems to generate large
amounts of training data to be used by machine learning algorithms. Both simulators
support a variety of popular communication protocols, but are rather heavyweight in terms
of required resources, which makes it harder to perform simulations that involve a large
number of vehicles.

Also, there are frameworks that combine different simulation aspects through a single
environment. AeroLoop [12] and FlyNetSim [24] provide simulation setups for conducting
flexible application experiments using multiple virtual drones with Wi-Fi communication
capabilities. Both frameworks utilize ArduPilot SITL [25] for the simulated drones and
the ns-3 network simulator for the wireless communication. AeroLoop needs more resources,
and it is better hosted in cloud-like computing infrastructures, since each simulation
entity corresponds to a separate virtual machine (VM), instead of the lightweight design
of FlyNetSim, where each simulated drone corresponds to a thread. However, AeroLoop
achieves complete isolation between the different simulation entities, and its virtual
drones also have access to a virtual camera sensor. UTSim [26] is another recent
simulation framework based on the Unity game engine, which is focused on studying air
traffic integration issues like sense and avoid, navigation and path planning algorithms.
The main drawback is that testing scenarios can be implemented only using its own user
interface, without offering integration capabilities regarding the most popular
application frameworks in the drone domain, like DroneKit and ROS, and the respective
communication protocols, e.g., MAVLink.

In this work, we utilize AeroLoop’s design, but significantly enrich its functionality by
providing a digital twin configuration. Furthermore, we extend its simulation
capabilities by integrating more flight dynamics simulators, like Gazebo, and supporting
a more lightweight execution through LXDs instead of VMs.

Assessment of cyber-physical systems. There is a variety of approaches towards assessing
systems operating in dynamic environments. A broad concept that can be used at all stages
of the (design-build-operate) lifecycle of such systems is that of the digital twin (DT),
which is a dynamic virtual representation of a physical system/entity, consisting of a
simulation model and data coming from the real world. The intended use determines the
models to be used. It may vary from achieving better design or manufacturing, to running
what-if simulations to predict failures or optimize performance.

For instance, [27] combines a vehicle dynamics simulation of a virtual vehicle, which is
the digital twin of the real one, and a traffic simulation, which provides the behavior
of the other traffic participants, during the development phase of automated driving
functions in order to identify critical scenarios and improve the provided functionality.
On the other hand, [28] presents a robotic system that employs predictive runtime
validation through look-ahead simulation in order to accomplish its goal while assuring
its safety. To achieve this, it uses the Stage robot simulator [29] to assess in real
time all possible actions of the decision search tree (coming from the robot and the
dynamic environment), predicts their consequences, and selects the most appropriate one.
Also, [30]


proposes the runtime monitoring of software components through the execution of their digital twins (which are considered abstract specifications) in a simulated environment in order to detect and mitigate malicious behaviors.

The authors in [31] argue that in software systems offline verification before deployment must be accompanied by quantitative online verification of the key requirements at runtime in order to achieve software dependability and adaptiveness, through the identification, and sometimes prediction, of requirement violations. Along the same lines, in this work, we adopt such a holistic checking approach through a framework that provides the means to test the various software entities of a PaaS system both in an offline and online fashion.

VII. CONCLUSION

We have presented our approach for supporting offline validation and runtime checking in the context of a PaaS system for drone-based applications, through suitable simulation and digital twin mechanisms. Also, we have discussed key aspects of our implementation and have illustrated its functionality through indicative simulation and real-world scenarios.

The proposed framework has been successfully used in research projects, focused on the pre-deployment testing of experiments with unmanned vehicles and the validation of an automated inspection system of photovoltaic parks using drones. At the same time, we are continuously working on different improvements and extensions. On the one hand, we wish to explore ways of enriching the digital twin setup in order to have the ability to run predictive simulations at runtime. On the other hand, we are in the process of integrating yet another form of runtime testing, through the support of suitable drills that imitate specific problematic situations in the PaaS in order to check the successful triggering of the respective compensating actions.

ACKNOWLEDGMENT

This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE, project PV-Auto-Scout, code T1EDK-02435.

REFERENCES

[1] A. R. Singh, “How Drones are crucial for Smart Cities?” https://www.geospatialworld.net/blogs/how-drones-are-crucial-for-smart-cities/ (2018-09-04).
[2] E. Vattapparamban, I. Guvenc, A. I. Yurekli, K. Akkaya, and S. Uluagac, “Drones for smart cities: Issues in cybersecurity, privacy, and public safety,” in Proc. International Wireless Communications and Mobile Computing Conference, 2016, pp. 216–221.
[3] D. Wright and R. Finn, “Making drones more acceptable with privacy impact assessments,” in Information Technology and Law Series. T.M.C. Asser Press, 2016, vol. 27, pp. 325–351.
[4] NASA, “Unmanned Aircraft System (UAS) Traffic Management (UTM),” https://utm.arc.nasa.gov/index.shtml.
[5] SESAR, “U-Space,” https://www.sesarju.eu/U-space.
[6] M. Murisonon, “Drones Will Be Shot Down Until These Misconceptions Are Tackled,” https://dronelife.com/2019/03/04/drones-will-be-shot-down-until-these-misconceptions-are-tackled/ (2019-03-04).
[7] J. Yapp, R. Seker, and R. Babiceanu, “UAV as a service: Enabling on-demand access and on-the-fly re-tasking of multi-tenant UAVs using cloud services,” in Proc. IEEE/AIAA Digital Avionics Systems Conference, 2016.
[8] A. Koubâa, B. Qureshi, M.-F. Sriti, A. Allouch, Y. Javed, M. Alajlan, O. Cheikhrouhou, M. Khalgui, and E. Tovar, “Dronemap Planner: A service-oriented cloud-based management system for the Internet-of-Drones,” Ad Hoc Networks, vol. 86, pp. 46–62, 2019.
[9] A. Van’t Hof and J. Nieh, “AnDrone: Virtual drone computing in the cloud,” in Proc. EuroSys Conference, 2019, pp. 6:1–6:16.
[10] N. Grigoropoulos and S. Lalis, “Flexible deployment and enforcement of flight and privacy restrictions for drone applications,” in Proc. IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), 2020, pp. 110–117.
[11] F. Bonomi, R. Milito, P. Natarajan, and J. Zhu, “Fog computing: A platform for internet of things and analytics,” in Big Data and Internet of Things: A Roadmap for Smart Environments. Springer International Publishing, 2014, pp. 169–186.
[12] M. Koutsoubelias, N. Grigoropoulos, and S. Lalis, “A modular simulation environment for multiple UAVs with virtual WiFi and sensing capability,” in Proc. IEEE Sensors Applications Symposium, 2018.
[13] J. Petazzoni, “Build reliable, traceable, distributed systems with ZeroMQ,” https://us.pycon.org/2012/schedule/presentation/260/.
[14] G. F. Riley and T. R. Henderson, “The ns-3 network simulator,” in Modeling and Tools for Network Simulation. Springer Berlin Heidelberg, 2010, pp. 15–34.
[15] A. R. Portabales and M. L. Nores, “Dockemu: Extension of a scalable network simulation framework based on docker and NS3 to cover IoT scenarios,” in Proc. International Conference on Simulation and Modeling Methodologies, Technologies and Applications (SIMULTECH), 2018, pp. 175–182.
[16] ZeroMQ, “Open-source messaging library,” https://zeromq.org/.
[17] MAVLink, “Drone communication protocol,” https://mavlink.io/en.
[18] DroneKit, “Developer tools for drones,” http://dronekit.io/.
[19] ROS, “Robot Operating System,” https://www.ros.org.
[20] ArduPilot, “Open source autopilot,” http://ardupilot.org.
[21] “MAVProxy,” http://ardupilot.github.io/MAVProxy/html/index.html.
[22] N. Koenig and A. Howard, “Design and use paradigms for gazebo, an open-source multi-robot simulator,” in Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2004, pp. 2149–2154.
[23] S. Shah, D. Dey, C. Lovett, and A. Kapoor, “AirSim: High-fidelity visual and physical simulation for autonomous vehicles,” in Field and Service Robotics, 2017. [Online]. Available: https://arxiv.org/abs/1705.05065
[24] S. Baidya, Z. Shaikh, and M. Levorato, “FlyNetSim: An open source synchronized uav network simulator based on ns-3 and ardupilot,” in Proc. ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, 2018, pp. 37–45.
[25] ArduPilot, “SITL Simulator,” http://ardupilot.org/dev/docs/sitl-simulator-software-in-the-loop.html.
[26] A. Al-Mousa, B. H. Sababha, N. Al-Madi, A. Barghouthi, and R. Younisse, “UTSim: A framework and simulator for UAV air traffic integration, control, and communication,” International Journal of Advanced Robotic Systems, vol. 16, no. 5, 2019.
[27] S. Hallerbach, Y. Xia, U. Eberle, and F. Koester, “Simulation-based identification of critical scenarios for cooperative and automated vehicles,” SAE International Journal of Connected and Automated Vehicles, vol. 1, no. 2, pp. 93–106, 2018.
[28] C. Blum, A. F. T. Winfield, and V. V. Hafner, “Simulation-based internal models for safer robots,” Frontiers in Robotics and AI, vol. 4, 2018.
[29] R. Vaughan, “Massively multi-robot simulation in stage,” Swarm Intelligence, vol. 2, no. 2-4, pp. 189–208, 2008.
[30] E. Cioroaica, F. D. Giandomenico, T. Kuhn, F. Lonetti, E. Marchetti, J. Jahic, and F. Schnicke, “Towards runtime monitoring for malicious behaviors detection in smart ecosystems,” in Proc. IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 2019, pp. 200–203.
[31] R. Calinescu, C. Ghezzi, M. Kwiatkowska, and R. Mirandola, “Self-adaptive software needs quantitative verification at runtime,” Communications of the ACM, vol. 55, no. 9, pp. 69–77, 2012.


E-Scooter Sharing: Leveraging Open Data for System Design

Alessandro Ciociola^a, Michele Cocca^a, Danilo Giordano^b, Luca Vassio^a, and Marco Mellia^a
^a Department of Electronics and Telecommunications, Politecnico di Torino, first.last@polito.it
^b Department of Control and Computer Engineering, Politecnico di Torino, first.last@polito.it

Abstract: With the shift toward a Mobility-as-a-Service paradigm, electric scooter sharing systems are becoming a popular means of transportation in cities. Given their novelty, we lack consolidated approaches to study and compare different system design options. In this work, we propose a simulation approach that leverages open data to create a demand model that captures and generalises the usage of this means of transportation in a city. This calls for ingenuity to deal with coarse open data granularity. In particular, we create a flexible, data-driven demand model by using modulated Poisson processes for temporal estimation, and Kernel Density Estimation (KDE) for spatial estimation. We next use this demand model alongside a configurable e-scooter sharing simulator to compare the performance of different electric scooter sharing design options, such as the impact of the number of scooters and the cost of managing their charging. We focus on the municipalities of Minneapolis and Louisville, which provide large scale open data about e-scooter sharing rides. Our approach lets researchers, municipalities and scooter sharing providers follow a data-driven approach to compare and improve the design of e-scooter sharing systems in smart cities.

Index Terms: open data, demand model, scooter sharing, electric vehicle, data driven optimization.

978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION

Urban mobility presents a number of non-trivial challenges both for researchers and regulators. Some of these challenges are related to sustainability and pollution: in the EU, for example, urban mobility accounts for 40% of all CO2 emissions of road transport and up to 70% of other pollutants come from transport systems.^1 The need to reduce emissions and congestion, along with the rise of the sharing economy, has moved several policy makers to promote micro-mobility services in cities. These services refer to lightweight, often electric-powered vehicles rented for short trips and typically operating at low speeds.

^1 https://ec.europa.eu/transport/themes/urban/urban_mobility_en

In this context electric scooters (e-scooters) represent a sustainable and cheap alternative to reduce the number of private vehicle trips [1] and consequently traffic congestion [2] and land use [3]. Indeed, e-scooters are among the fastest growing electric micro-mobility means. The number of companies offering e-scooters to rent and the number of cities where the service is available keep growing. Indeed, the e-scooter usage is nowadays comparable with popular car ride-sharing services like Uber and Lyft [4].

Since 2017 several e-scooter companies have started their services in many cities in North America and Europe. Internet-of-Things technologies, paired with accurate GPS tracking, allow the providers to track the position of the e-scooters and monitor users’ trips. These data can be used to understand the impact and the utilization of e-scooters in the smart city mobility ecosystem. In this direction, municipalities started offering open data to let other players study alternative solutions.

In this work we are the first - to the best of our knowledge - to study the service sustainability of e-scooter systems from the point of view of a provider. Notice that the peculiarities of this novel scenario call for new approaches (see Section II for a discussion). In this work, we consider the municipalities of Minneapolis and Louisville as use cases.

First, we need to understand how, when and where e-scooters are used by the users, i.e., the mobility demand. For this purpose we rely on open data. Open data typically shares coarsely aggregated data for privacy reasons. This challenges its usage, and calls for ingenuity to appropriately pre-process data with spatio-temporal disaggregation techniques to increase resolution and derive a flexible - albeit realistic - demand model. For this, we combine Poisson processes for customers’ arrivals, and Kernel Density Estimate to model the spatial demand [5]. To allow other researchers to reproduce and extend our results, we make our demand models available upon request.

Afterwards, we leverage the constructed demand model to run simulation studies to compare different fleet management policies, with a focus on battery charging strategies. For this we extend our simulator implemented in [6] to support e-scooter scenarios. The simulator allows us to model system parameters such as the operative area granularity, vehicle characteristics, fleet size, users’ preferences or fleet management policies. It simulates the search, rental, and return of e-scooters by customers, and the battery consumption and charging operations needed to maintain the fleet. As performance metrics, we mainly focus on satisfied trips, i.e., the fraction of customers’ requests that the system can accommodate; and the fleet management cost, proportional to the time workers have to spend to reach and charge the e-scooter battery, assuming a battery swap policy.

The results show that with a spatio-temporal disaggregation coupled with Poisson process and the Kernel Density Estimate


we can create a reliable demand model to perform accurate simulations. Our findings show how e-scooter operators should carefully evaluate the best trade-off to balance the users’ satisfaction and the fleet management costs. In particular, we show (i) the impact of the size of the fleet, (ii) the impact of the choice of when to swap/charge the batteries, (iii) the implications of using workers or asking for the users’ cooperation in charging operations.

Results show that the very heterogeneous demand calls for a large number of e-scooters. Similarly, the fleet management operations have a high cost due to many battery swap operations. Furthermore, reducing the time for workers to reach the e-scooters and change their batteries has a fundamental impact on reducing cost. Alternatively, directly involving the users in the charging process would further reduce costs, becoming a key design decision.

The paper is organized as follows. In Sec. II we discuss existing works about e-mobility and charging solutions. In Sec. III we describe and characterize the used open datasets. In Sec. IV we introduce the spatio-temporal disaggregation techniques to create our demand models, as well as the simulation assumptions and performance metrics. In Sec. V we show results of our methodology for the cities of Louisville and Minneapolis. Finally in Sec. VI we summarize the paper and present future directions.

II. RELATED WORK

The impact of e-scooters on urban mobility is an emerging research topic. The seminal work [7] tested in 2011 the benefits of e-scooters on commuters. Since then, few other studies have tried to gauge the impact of e-scooters on mobility. For instance, authors of [8] present an extensive market analysis emphasizing the possible growth in the usage of e-scooters and raising the problem of how to handle the charging process in the presence of large fleets. As a possible solution, authors of [9] propose a model where a MILP formulation clusters together the e-scooters that need to be charged. Similarly, authors of [10] study the benefits of electric fleets (of e-scooters and e-bikes) in last mile delivery for big players in Milan. They are among the first to exploit real data - albeit collected from a very limited deployment (less than 75 vehicles). Authors of [11] offer a first characterization of users’ habits by collecting the daily trips of 38 users, pointing out how the leisure component is relevant for e-scooters. More recent works ([12], [13]) compare micro-mobility services (dockless bike, e-bike and e-scooters) using data exposed by providers. The results confirm that users prefer e-scooters to cover trips shorter than 1.6 km. Moreover the e-scooter daily patterns do not match the commuting patterns. In [14] the authors show that the number of bookings per hour is higher in good weather conditions. These characteristics reinforce the need for specific models and tools to study this new type of mobility.

To the best of our knowledge, our work is the first to present a holistic approach to study and compare different system design options, leveraging large open data. We follow a similar approach as in our previous work [15], [16] where we analyze customers’ mobility demand patterns in a free floating car sharing system. Here we revisit our methodology in the context of micro-mobility.

Considering shared electric vehicle systems in general, the major challenge is the battery charging process. The battery swap appears the most suitable approach for e-scooters but its study has never been explicitly targeted so far in scientific literature. Early studies about e-buses [17] focus their attention on the management of possible battery switch stations and their placement. Other models focus on optimizing the charging process for large vehicles taking into account electric network constraints and system degradation [18], or considering the distance travelled to reach a battery switch station [19]. In a recent work [20], authors optimize battery switch stations considering costs of energy, equipment degradation and energy demand variability.

Few studies analyze the battery swap process applied to shared vehicles. Authors of [21] propose a mixed integer programming formulation to maximise the satisfied trips in an electric station-based car sharing system, minimizing at the same time the number of battery swaps. Authors of [22] propose an optimal schedule for EV battery swap at stations minimizing travel distance and electrical usage. Differently from our work, all these models do not fit the e-scooter scenario because they do not consider small vehicles and small batteries, hence they do not allow local swap of the batteries.

III. DATA COLLECTION AND CHARACTERIZATION

In this section we describe the datasets and characterize the system usage focusing on the most important metrics that would impact the design of an e-scooter sharing system.

A. Dataset description

We focus our study on two cities in the US, namely Louisville and Minneapolis, whose municipalities make available data about all the e-scooter rides performed by the customers using any of the e-scooter sharing providers present in each city.^2 To protect the riders’ privacy and avoid leaking any company-specific strategy, data do not contain any identifier of the company, or vehicle, or customer. Furthermore, data is aggregated and/or fuzzed following NACTO guidelines in order to make user tracking impossible.^3 This challenges the direct usage of the open data, and calls for ingenuity to derive suitable models.

^2 Datasets are available at: https://data.louisvilleky.gov/dataset/dockless-vehicles, and http://opendata.minneapolismn.gov/search
^3 National Association of City Transportation Officials https://nacto.org/

In our cases, each trip exposes information describing the trip duration, distance, starting and ending position, and the time when the trip started. Different quantisation applies. For Louisville, starting and ending positions are encoded with GPS coordinates rounded to 3 decimals (approximately 80 m bins); trip duration is given with a precision of one minute, and the starting timestamp is rounded to the closest 15-minute period. Minneapolis data expose similar information but even more aggregated. Origin and destination positions are defined


by street IDs so that each trip refers to an entire street length rather than precise coordinates. Timestamps are rounded to the closest 30-minute period. These roundings are essential to protect the users’ privacy, but they complicate the extraction of useful insights from the data. The granularity of ride duration, distance, day and hour of the day still allows us to extract useful patterns about e-scooter usage over time. However, the absence of e-scooter identifiers, precise coordinates and timestamps makes it impossible to track how each e-scooter moves in the city. Thus we cannot simply replay the same trace in a simulator as done for car sharing services (e.g., in [15]).

B. Dataset characterization

First we provide a data characterization to help understand the scenarios we are facing. In Fig. 1 we report the number of total recorded trips (i.e., rentals) for each day over the months of July, August and September 2019. More than half a million and 180 k trips have been recorded in Minneapolis and Louisville, respectively. Interestingly, while Louisville shows a repetitive weekly pattern with peaks over weekends but without any specific trend, Minneapolis exhibits an increasing trend. The different number of daily trips justifies the difference in fleet size among the cities, with Minneapolis having more than twice as many e-scooters as Louisville (see Table I).^4 Some sudden falls are related to bad weather conditions that affect the willingness of customers to rent an e-scooter [14].

^4 As no vehicle ID is present the maximum number of vehicles is extracted from the Louisville service description (see footnote 2) and the Minneapolis official website http://www.minneapolismn.gov/publicworks/trans/WCMSP-212816

Fig. 1: Time series of trips per day [plot; x-axis: date, 08 Jul '19 to 30 Sep '19; y-axis: trips per day; series: Minneapolis, Louisville]

TABLE I: Dataset characteristics (Jul. 1st to Sep. 30th, 2019)

City          Trips    Fleet size   Trip duration [min]   Trip distance [km]   Operative area [km2]
                                    Avg       Med         Avg      Med
Minneapolis   511 k    2 000        13'30"    8'00"       0.95     0.45        268
Louisville    187 k    850          13'50"    7'55"       1.78     1.20        83

To analyze how the demand is distributed during the hours of the day, in Fig. 2 we report the average number of trips per hour per weekday (solid line) and per weekend (dashed line). As expected, Minneapolis exhibits more trips per hour than Louisville. At night we observe a negligible number of trips, with Louisville showing slightly higher figures probably due to a more vibrant nightlife. During weekdays we observe a high utilization during central hours of the day (12:00 to 17:00) rather than during commuting hours. This drastically differs from what is commonly observed for other shared transportation means like car sharing [15] where utilization peaks during commuting time. Regarding weekends, Louisville confirms the higher utilization with about 30% more trips than during weekdays. This result highlights the importance of a correct characterization of the usage of different transportation means - which is fundamental to study system design alternatives.

Fig. 2: Average trips per hour in weekends (WE) and working days (WD) [plot; x-axis: hour of day, 0 to 23; y-axis: average trips per hour; series: Minneapolis WD, Minneapolis WE, Louisville WD, Louisville WE]

We now focus on the characterization of two important metrics: (i) trip duration and (ii) trip distance covered. These metrics are fundamental to understand the e-scooter availability and battery discharge properties. Fig. 3a reports the Empirical Cumulative Distribution Functions (ECDFs) of the trip duration for each city during weekdays and weekends. The similarity in the duration is striking, with both Minneapolis and Louisville trips lasting longer during the weekdays than the weekends. Recall that the Louisville dataset exposes time duration with a minute granularity, which causes the quantisation seen in the ECDF. Overall, trip duration is very short, with the majority of the trips lasting less than 13 minutes. This reflects on the trip distance, as seen in Fig. 3b. Observe that almost 90% of the trips are shorter than 4 km, and more than 60% are shorter than 2 km. These results confirm the typical usage of e-scooters [12], [13]. Notice also the different service area size of Minneapolis and Louisville which allows for longer trips in the former. Table I provides a summary of the data.

Considering the spatial characterization of the demand we observe that most trips are confined to a few relatively small neighborhoods. Fig. 4 shows heatmaps to intuitively gauge this effect. Here, we divide the service areas of each city in 200 m x 200 m cells. Then we count the number of trips originating in each cell during the three months. We use a decimal logarithmic scale. The heatmap shows how concentrated trips are, with few hotter (in red) cells that account for 4 orders of magnitude more trips than those cells with few trips (in blue). Overall, this information is fundamental to generate a demand model to compare different system designs.

IV. SYSTEM MODEL AND SIMULATOR

In this section, we first describe the spatio-temporal disaggregation methodology that we employ to generalize the trips present in the open data. Second we detail how we use them


Minneapolis WE Minneapolis WD Louisville WE Lousville WD possibility to smooth the point process of a trace over a multi-
dimensional space while maintaining the origin/destination
1.0 1.0 correlation.
0.8 0.8 1) Time Modeling: We assume that the inter-arrival time of
0.6 0.6 trips follows an exponential distribution with rate that depends
ECDF

ECDF
on the type and hour of the day. To account for the highly
0.4 0.4 periodical rate as seen in Fig.1, here we distinguish between
0.2 0.2 weekday and weekends. We consider 24 time bins of 1 h each
(48 periods in total), where the Poisson arrival rate reflects the
0.0 0.0
0 20 40 60 0 2 4 6 8 10 average rate of requests in the original dataset. This allows to
Duration [min] Distance [km] scale the overall demand by introducing a global scaling factor.
(a) Trip duration (b) Trip distance Not reported here for brevity, we compare the number of trips
in the simulated and the disaggregated trace. As expected,
Fig. 3: ECDFs for trips duration and distance in weekends
there is a very good match (relative percentage residuals for
(WE) and working days (WD)
total trips between 0.6% and 1.3% for Louisville, 0.8% and
3.4% for Minneapolis)
to generate our demand model. Finally we use the demand 2) Spatial Modeling: Given an hour and a day, we want to
model to determine the occurring trips and feed our mobility generate origin and destination of a request according to the
simulator. specific demand model as exhibited in the disaggregated trace.
For this, we leverage KDE to estimate the joint probability
A. Spatio-temporal disaggregation distribution of the origin and destination positions of a trip.
Assume we have a dataset D of trips recorded during Given our scenario, this is fundamental to further smooth our
a given period of time. Each trip i ∈ D is defined by discrete events.
a discrete start time as (i), i.e., with time rounded with a To ease the KDE computation and the simulation process,
granularity ∆T (of 15 or 30 minutes in our case). To provide we divide the whole city area into contiguous squared zones
an estimation of the time instant in which the trip started, of side 200 m and map the trips to this grid.
we assume a local stationary process, and simply extract a Then, for each of the 48 time bins we fit a separate
new timestamp ts (i) from a uniform distribution in range
 KDE based on the origin-destination zone grid, obtaining a
∆T ∆T
as (i) − , as (i) + . This allows to get back to a four dimensional problem (2 coordinates for origin and 2
2 2
continuous-time trace of events. coordinates for destination). In this way, we obtain 48 models
Considering the spatial information, origin o(i) and des- summarising the spatial mobility habits of the users in time.
tination d(i) positions may be already associated to spatial Here, we consider a Gaussian kernel [5] and set the bandwidth
coordinates, albeit rounded. First, we compute the distribution of distance between o(i) and d(i), which will be useful to generate trip distances later. Second, we obtain, for each of the (o(i), d(i)) pairs, the trip duration from the open data.

Origin and destination information might be aggregated into different geometries o_id(i) and d_id(i). We have to employ a spatial disaggregation methodology to derive possible coordinates. In the Minneapolis case, o_id(i) and d_id(i) are segments representing streets, and we randomly select two coordinates along the entire street (with uniform probability). We thus obtain possible origin o(i) and destination d(i) coordinates for each trip i.

At the end of this pre-processing step, we have a new disaggregated trace where each trip in the dataset is characterized by its start time and its initial and final coordinates.

B. Demand model
The goal of the demand model is to generalize the trace generated from the original open data. For this, we model the demand in time by using modulated Poisson processes, a commonly accepted model for i.i.d. service requests of a very large population [23]. For space, we generalize the demand using Kernel Density Estimation (KDE) [5]. KDE gives us the

matrix of the KDE to the 4 x 4 identity matrix. Given the 200 m x 200 m zoning, this corresponds to a bandwidth selection of 200 m for each coordinate. On the one hand, a smaller bandwidth would not help us to generalize the demand. On the other hand, a bigger bandwidth would reduce the granularity of city zoning, leading to a reduced precision in incorporating spatial patterns.

In a nutshell, we use KDE as a spatial data smoothing tool, able to capture mobility patterns from the trips in the disaggregated trace while reducing the impact of the original open data aggregation. This is also very effective in coping with the fine-grained spatial quantisation that is needed to model the demand of e-scooter sharing systems. To show how effective this is, in Fig. 4 we report the demand in each zone before and after applying the smoothing procedure for Louisville (Fig. 4a) and Minneapolis (Fig. 4b). To ease readability we report only the demand in the peak hour. Looking at the demand before the smoothing, most of the trips are concentrated in a few areas, with large differences also between nearby cells, resulting in a very noisy picture. The most popular zones do not change with the smoothing, but we observe a redistribution of the requests among neighboring zones. In a nutshell, trips are no longer concentrated in single cells but rather in larger areas.
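As a rough illustration of the smoothing step described above, the sketch below spreads trip origins over a 200 m x 200 m grid with an isotropic Gaussian kernel of 200 m bandwidth. It is a simplification under stated assumptions, not the authors' implementation: it smooths origins only (a 2-D kernel, whereas the bandwidth matrix in the text is 4 x 4), it evaluates the kernel at cell centres, and the function name kde_zone_weights is hypothetical.

```python
import math

CELL_M = 200.0       # zone side in metres (from the text)
BANDWIDTH_M = 200.0  # per-coordinate bandwidth in metres (from the text)


def kde_zone_weights(points, n_cols, n_rows, cell=CELL_M, h=BANDWIDTH_M):
    """Smooth trip origins onto a grid; returns row-major weights summing to 1.

    points: list of (x, y) origin coordinates in metres.
    Hypothetical helper: a Gaussian kernel evaluated at each cell centre.
    """
    if not points:
        raise ValueError("need at least one trip origin")
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cx, cy = (c + 0.5) * cell, (r + 0.5) * cell  # cell centre
            density = sum(
                math.exp(-((cx - x) ** 2 + (cy - y) ** 2) / (2.0 * h * h))
                for x, y in points
            )
            row.append(density)
        grid.append(row)
    total = sum(sum(row) for row in grid)
    # Normalise so the weights can be used as zone probabilities.
    return [[v / total for v in row] for row in grid]


# Ten trips all recorded at the centre of one cell of a 5 x 5 grid.
origins = [(500.0, 500.0)] * 10
weights = kde_zone_weights(origins, n_cols=5, n_rows=5)
```

With raw counts, the single cell would hold all ten trips. After smoothing, the same cell still has the largest weight, but its neighbours receive a share, which is the kind of redistribution the text reports for Fig. 4.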


(a) Louisville: demand in open data (left) and in the peak hour model (right). (b) Minneapolis: demand in open data (left) and in the peak hour model (right).
Fig. 4: Heatmap of the number of trips starting from each zone, in a decimal logarithmic scale (the legend reports the exponent, from 0.0 to 3.0; map scale bar: 2 km). The warmer the color, the higher the value.

C. Mobility and charging simulator
Our goal is to simulate a fleet of e-scooters that move within the city. The simulator uses the demand model to generate mobility requests. During the simulation we track each e-scooter over time, saving information about its location and battery state.

We use an event-based simulator. The simulator has a set S of e-scooters. At any time t, each e-scooter s ∈ S is characterised by its location P(s, t) and state of charge c(s, t) ∈ [0, C], where C is the maximum battery capacity. As previously, we use a 200 m x 200 m grid. At t = 0, e-scooters are placed at random proportionally to the spatial demand, with uniform random charge c(s, 0) ∈ [C/2, C].

The model generates trip-request event i at time t_i according to the Poisson model. It extracts the origin and destination coordinates ô(i) and d̂(i) from the KDE, and associates the trip duration f̂(i) and distance l̂(i) according to the CDF extracted from the original open data. The latter allows us to compute the eventual energy consumption assuming simple proportionality, i.e., e(i) = k · l̂(i). We obtain k from the e-scooter characteristics. When the i-th trip-request event fires, the simulator checks if there is any e-scooter s with enough battery, c(s, t_i) ≥ e(i), available in the same zone or in its 1-hop neighbors (the 8 adjacent zones in the grid). This is equivalent to assuming that customers are willing to rent an e-scooter located in their own zone or in a neighboring one, walking at most approximately 300 m to reach it.

If more than one e-scooter exists, the simulator picks s*, the one having the highest c(s, t_i). It then schedules a trip-end event at time t_i + f̂(i). Otherwise, it marks the request as unsatisfied. In both cases, it schedules the next trip-request event at time t_i + negexp(λ(t_i)), λ(t_i) being the current request rate. When the j-th trip-end event fires at time t_j, the simulator picks the e-scooter s* used for this trip, updates its battery charge c(s*, t_j) = c(s*, t_j) − e(j), makes s* available in position d̂(j), and checks if a charging process is required. That is, it checks if c(s*, t_j) < α · C, α ∈ [0, 1] being a threshold. If so, it triggers a charging event.

The charging operation can be performed either by the e-scooter provider through a battery swap operation, or by volunteers through a battery charging operation.

System battery swap: the e-scooter provider manages the charge events by means of a workforce of N worker-equivalents. Battery charge requests are modeled with a FIFO queue with N parallel servers, as follows:
• Charge request arrival: if there is a free server, the request gets service immediately. Otherwise, the request gets queued and waits to be processed by a worker.
• Service time: the battery swap entails two service operations: the Reach time, i.e., the time it takes the worker to physically reach the e-scooter; and the Swap time, i.e., the time it takes the worker to complete the battery swap operation.
We model the reach time and swap time as negative exponential distributions with averages T_reach and T_swap.

Volunteer charging: we model the possibility that volunteers may contribute to fleet energy management, as done by some companies that remunerate people to handle the charging of e-scooters. When a charge is needed, a volunteer may be found with probability w ∈ [0, 1], where w models people's willingness to contribute to the system. If found, we assume the volunteer brings the e-scooter home and plugs it in for charging. We assume the charging time to be a Gaussian random variable with average T_charge and standard deviation σ_charge. The charging time is a random variable as it includes the whole process of taking the e-scooter home, charging it, and bringing it back to the streets (in the same location as before, for simplicity).
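The trip-request and trip-end rules of the simulator can be sketched as follows. This is a minimal sketch, not the authors' code: the dictionary layout of the fleet and the function names (neighbourhood, pick_scooter, end_trip, next_request_time) are assumptions made for illustration, and what happens to an e-scooter while it waits for a charge is left to the caller.

```python
import random


def neighbourhood(zone):
    """The zone itself plus its 8 adjacent grid cells."""
    zx, zy = zone
    return {(zx + dx, zy + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}


def pick_scooter(scooters, origin_zone, energy_needed):
    """Return the id of the chosen e-scooter, or None if the request is unsatisfied.

    scooters: dict id -> {"zone": (x, y), "charge": Wh, "free": bool}
    (hypothetical layout).
    """
    nearby = neighbourhood(origin_zone)
    candidates = [
        sid for sid, s in scooters.items()
        if s["free"] and s["zone"] in nearby and s["charge"] >= energy_needed
    ]
    if not candidates:
        return None
    # Among the feasible e-scooters, take the one with the highest charge.
    return max(candidates, key=lambda sid: scooters[sid]["charge"])


def end_trip(scooter, dest_zone, energy_used, alpha, capacity):
    """Apply the trip-end update; True means a charging event is triggered."""
    scooter["charge"] -= energy_used
    scooter["zone"] = dest_zone
    scooter["free"] = True
    return scooter["charge"] < alpha * capacity


def next_request_time(t, rate, rng=random):
    """t + negexp(rate): exponential inter-arrival time at the current rate."""
    return t + rng.expovariate(rate)


fleet = {
    "a": {"zone": (0, 0), "charge": 100.0, "free": True},
    "b": {"zone": (1, 1), "charge": 300.0, "free": True},
    "c": {"zone": (5, 5), "charge": 400.0, "free": True},
}
chosen = pick_scooter(fleet, origin_zone=(0, 0), energy_needed=55.0)
fleet[chosen]["free"] = False  # the trip starts
needs_charge = end_trip(fleet[chosen], (2, 2), 55.0, alpha=0.3, capacity=425.0)
```

In this example, e-scooter "c" holds the most charge but lies outside the 1-hop neighbourhood, so "b" is picked; a 5 km trip at 11 Wh/km leaves it above the threshold and no charging event is raised.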

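The provider-side charging model, a FIFO queue served by N workers with exponential reach and swap times, can be approximated with the sketch below. It is an illustration under assumptions: the function names are hypothetical, time units are left to the caller, at least one worker is required, and requests are processed strictly in arrival order.

```python
import heapq
import random


def _negexp(mean, rng):
    """Exponential draw with the given mean; a zero mean gives zero time."""
    return rng.expovariate(1.0 / mean) if mean > 0 else 0.0


def swap_completion_times(arrivals, n_workers, t_reach, t_swap, rng):
    """FIFO queue with n_workers (>= 1) parallel servers.

    arrivals: times at which charge requests are raised.
    Returns (finish_times, total_work): finish times in arrival order and
    the summed reach + swap time spent by the workers.
    """
    free_at = [0.0] * n_workers  # time at which each worker is next idle
    heapq.heapify(free_at)
    finish_times, total_work = [], 0.0
    for t in sorted(arrivals):
        # The request waits if every worker is still busy at time t.
        start = max(t, heapq.heappop(free_at))
        service = _negexp(t_reach, rng) + _negexp(t_swap, rng)
        heapq.heappush(free_at, start + service)
        finish_times.append(start + service)
        total_work += service
    return finish_times, total_work


rng = random.Random(42)
requests = [0.0, 1.0, 2.0, 3.0]  # request times, e.g. in minutes
done, work = swap_completion_times(
    requests, n_workers=2, t_reach=30.0, t_swap=5.0, rng=rng
)
```

Setting both means to zero reproduces the ideal policy, in which every request completes at the instant it is raised. The returned total_work corresponds to the Swap Time metric only in the loose sense of summed service time.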

D. Performance metrics
To compare system performance and gauge the impact of parameters, we consider two fundamental metrics:
i) The Satisfied Demand measures the percentage of trip requests that can be satisfied due to the presence of e-scooters with enough energy in the trip origin zone.
ii) The Swap Time measures the total man-time needed to handle the battery swap operations.
The simulator also breaks down the unsatisfied demand to distinguish between i) no e-scooter is available, and ii) e-scooters do not have enough energy to complete the trip request. Similarly, it maps events to the city map to observe the city areas where most of these events occur.

V. RESULTS
Here we present simulation results obtained starting from the original open data, from which we first generate a disaggregated trace and then extract the trip request model as described above. We use the model to run simulations to gauge the impact of system design choices. In particular we study the impact of:
• |S|: the e-scooter fleet size;
• α: the battery threshold that triggers a charging operation;
• N: the provider workforce size;
• T_reach: the average time to reach the e-scooter;
• w: volunteers' willingness to handle charging.
We assume a homogeneous fleet of e-scooters having a C = 425 Wh battery capacity and k = 11 Wh/km energy efficiency, based on average characteristics present on the market.

A. Impact of fleet size
We first evaluate the impact of the fleet size on the satisfied demand. We consider w = 0 and N = |S|, i.e., the system takes care of the charging, with enough workers to immediately perform the battery swap. To consider an ideal scenario, we fix T_reach = T_swap = 0. We choose α = 0.2 for Louisville and α = 0.4 for Minneapolis, so as to guarantee their maximum distance trips.

We report results in Fig. 5a and Fig. 5b for Minneapolis and Louisville, respectively. They show the percentage of satisfied demand (left y-axis, blue curves) and the average monthly number of trips performed by each e-scooter (right y-axis, red curve). Fleet size varies around the currently available number of e-scooters: 2 000 in Minneapolis and 850 in Louisville.

The average number of monthly trips per e-scooter decreases with |S|. At the same time, the bigger the fleet size, the higher the probability of finding an e-scooter in the desired origin zone, and hence the higher the percentage of satisfied demand. For Minneapolis, the currently available 2 000 e-scooters can satisfy less than 50% of the demand. Notice the sub-linear growth, hinting that spatial heterogeneity calls for possible relocation policies. For Louisville, results are better, with 60% of satisfied trips with 850 e-scooters. Doubling the fleet size would increase the satisfied demand by only about 15%.

(a) Minneapolis (b) Louisville
Fig. 5: Percentage of satisfied demand and average number of trips per e-scooter per month

B. Impact of charging threshold
Next, we evaluate the impact of the battery threshold α that triggers charging events. On the one hand, the lower the α, the less frequently e-scooters need to be charged. On the other hand, if α is too low, we may cause users' discomfort and lose revenue, as the probability of finding an e-scooter without enough energy would increase. If eventually taken, that e-scooter would run out of battery before reaching the desired destination. Here we set |S| = 2 000 for Minneapolis and |S| = 850 for Louisville. Again, we assume the ideal charging policy with T_reach = T_swap = w = 0 and N = |S|.

Fig. 6 reports the percentage of trips in which the user would run out of battery (left y-axis) and the percentage of trips that require a charge at the end of the trip (right y-axis). The latter represents the charging cost for the system. Starting from this (red curve), observe how the cost increases linearly up to α around 0.5, after which it quickly grows to 100%. Indeed, when α approaches 1, every e-scooter needs to be charged at the end of each trip. Looking at the fraction of trips that would not have enough energy to be completed (blue curves), we observe a sudden growth for values of α approaching 0. That is, if we allow the e-scooter battery to reach a very low level, the probability of not completing the trips increases. Minneapolis shows the strongest impact, with up to 10% of the trips being impossible (for α = 0). Instead, Louisville exhibits a negligible fraction even for very low α. This is due to its shorter trip distances compared with Minneapolis (see Fig. 3b). These results clearly highlight a trade-off between impossible trips (and loss of revenue) and number of charging events (and costs). Our model and simulator allow one to explore this in detail.
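The role of the threshold α can be made concrete with a small worked calculation. Under the ideal charging policy, one reading of the model is that any e-scooter left in service holds at least α · C, so the charge at the threshold bounds the trip length that is always feasible. The sketch below computes that bound from the stated C = 425 Wh and k = 11 Wh/km; the function name guaranteed_range_km is hypothetical and the interpretation is an assumption, not a statement from the authors.

```python
CAPACITY_WH = 425.0  # C, battery capacity (from the text)
WH_PER_KM = 11.0     # k, energy per km (from the text)


def guaranteed_range_km(alpha, capacity=CAPACITY_WH, k=WH_PER_KM):
    """Distance coverable with the charge left at the threshold alpha * C."""
    return alpha * capacity / k


for alpha in (0.2, 0.4, 1.0):
    print(f"alpha={alpha:.1f}: {guaranteed_range_km(alpha):.1f} km")
```

This gives about 7.7 km for α = 0.2, 15.5 km for α = 0.4, and 38.6 km for a full battery, which is consistent with choosing a higher threshold for the city with longer trips.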


(a) Minneapolis (b) Louisville
Fig. 6: Percentage of impossible trips caused by insufficient battery level (left scale) and number of needed battery swaps (right scale) by changing battery swap threshold α

Fig. 7: Minneapolis - percentage of satisfied demand, varying number of workers and average reach time

Fig. 8: Minneapolis - percentage of satisfied demand, varying number of workers and users' willingness (T_reach = 30 minutes)

C. Impact of charging policy
We evaluate the cost that the provider faces for the charge operations based on two different charging scenarios: with (w > 0) and without (w = 0) the users' cooperation. Here we fix T_swap = 5 min for the operator, and T_charge = 4 h and σ_charge = 30 min for the volunteers; the latter values reflect the average time needed to charge an e-scooter with similar characteristics. Given the ease of the Louisville case with respect to Minneapolis seen in the previous sections, here we just report the case of Minneapolis, with α = 0.3. First, we evaluate the cost when the charging operations are performed only by the workers (w = 0). For this we run simulations with 2 000 e-scooters and evaluate how many workers are needed to satisfy as much demand as possible. We define a worker as an always available resource (24 hours a day) that performs only one battery swap operation at a time. Since we model the time to reach the e-scooter (T_reach) as a stochastic variable, we also evaluate its impact on the charging cost [footnote 5]. Intuitively, when few workers are present, or when T_reach is too high, the charging FIFO queue grows, causing e-scooters to be unavailable and decreasing the satisfied demand.

[Footnote 5] Given our policy that only 1 battery swap operation is allowed per event, if two discarded e-scooters are close to each other we consider two reach times.

In Fig. 7 we evaluate the percentage of satisfied demand while increasing the number of workers simultaneously available in the system, with different values of T_reach. With small T_reach (15 minutes), we can see how with 8 workers we reach the highest satisfied demand, as in the best case scenario (Fig. 5a). The increase of the reach time causes a drop in the satisfied demand down to 30% when T_reach = 60 minutes, even when 14 workers are present. This strong dependence of satisfied demand on the number of workers suggests employing strategies to reduce T_reach as much as possible. For example, each worker could be assigned to service a limited area of the city.

Finally, we evaluate how the users' help reduces the charging cost for the operator. For this, we consider the same scenario as before, choosing T_reach = 30 minutes and evaluating different users' willingness (w). Intuitively, the more volunteers help, the fewer workers are needed to perform battery swap operations. At the same time, due to the longer time for the charge operation by the user, i.e., 4 hours, other effects may appear, like a decrease in the satisfied demand due to several e-scooters being under charge at the same time.

In Fig. 8 we show the impact on the satisfied demand of changing users' willingness with different numbers of workers. As a reference we also include the curve with w = 0 (same as in Fig. 7). Although users' recharges are generally longer, there is a limited impact on the availability of scooters, and therefore on satisfied demand. With a willingness w = 0.5 we can see how the number of workers needed to reach the maximum number of feasible trips halves, from 12 in Fig. 7 to 6 in Fig. 8. With w = 1, the management of the batteries is completely taken care of by volunteers. Interestingly, the longer unavailability due to longer charging time has negligible impact on the satisfied demand.

In Fig. 9 we show the total time employed by workers to perform the battery swaps on a daily basis. When w = 1, workers are not needed, hence the total average daily time is 0 hours. When w = 0, there are no volunteers, and the system needs up to 250 hours of cumulative daily work to reach the


maximum satisfied demand. Increasing w reduces the number of charging events the system has to handle, and thus the time spent.

Fig. 9: Minneapolis - Cumulative time needed by the workers to perform the management of the batteries, varying number of workers and users' willingness (T_reach = 30 minutes)

The results show how an e-scooter operator should carefully evaluate the best trade-off between using workers and asking for the users' cooperation, based on its cost for workers and for encouraging users' cooperation.

VI. CONCLUSION AND FUTURE WORK
In this paper we proposed a methodology to translate open data describing e-scooter sharing usage into a demand model able to capture and generalize the usage of this means of transportation in a city. We first converted coarse granularity data into a detailed trace. Then, we leveraged modulated Poisson processes and KDE to model the demand over time and space. Thanks to a flexible data-driven simulator, we compared different system design options to evaluate the impact of different e-scooter fleet management strategies.

Our findings show that the design of an e-scooter system calls for different trade-offs to balance the users' satisfaction and the management costs. Results show that, in order to satisfy the demand and prevent users from running out of battery, we need a large number of e-scooters and battery swap operations. Furthermore, we have analyzed different policies for managing the batteries. Reducing the time for workers to reach the e-scooters and change their batteries has a fundamental impact on reducing cost. Moreover, involving the users in the charging process might further reduce costs.

We believe that our approach can be useful for researchers, municipalities and e-scooter sharing providers to compare and improve the design of e-scooter sharing systems in smart cities. Our ongoing efforts are focused on three directions: (i) extending our methodology to evaluate the detailed economic aspects of the different options, (ii) evaluating how to improve the demand model by merging contextual data (as in [24]), and (iii) analyzing possible relocation operations to increase the satisfied demand (for example with genetic algorithms, which we used in [25] to improve the users' satisfaction).

REFERENCES
[1] M. Lee, J. Y. Chow, G. Yoon, and B. Y. He, “Forecasting e-scooter competition with direct and access trips by mode and distance in New York City,” arXiv preprint arXiv:1908.08127, 2019.
[2] J. Hollingsworth, B. Copeland, and J. X. Johnson, “Are e-scooters polluters? The environmental impacts of shared dockless electric scooters,” Environmental Research Letters, vol. 14, 2019.
[3] PBOT, “2018 e-scooter findings report,” https://www.portlandoregon.gov/transportation/article/709719, 2019.
[4] J. Fong, P. McDermott, and M. Lucchi, “Micro-mobility, e-scooters and implications for higher education,” https://upcea.edu/wp-content/uploads/2019/05/UPCEA_Micro_Mobility-White-Paper-May-2019.pdf, 2019.
[5] R. T. García, M. F. Lopez, J. C. Pérez Sánchez, and R. Pérez Sánchez, “The kernel density estimation for the visualization of spatial patterns in urban studies,” in International Multidisciplinary Scientific GeoConference SGEM, 2015.
[6] A. Ciociola, D. Markudova, L. Vassio, D. Giordano, M. Mellia, and M. Meo, “Impact of charging infrastructures and policies on electric car sharing systems,” in IEEE ITSC, 2020.
[7] J. D. Bishop, R. T. Doucette, D. Robinson, B. Mills, and M. D. McCulloch, “Investigating the technical, economic and environmental performance of electric vehicles in the real-world: A case study using electric scooters,” Journal of Power Sources, vol. 196, 2011.
[8] C. Sachs, S. Burandt, S. Mandelj, and R. Mutter, “Assessing the market of light electric vehicles as a potential application for electric in-wheel drives,” in IEEE EDPC, 2016.
[9] M. Masoud, M. Elhenawy, M. H. Almannaa, S. Q. Liu, S. Glaser, and A. Rakotonirainy, “Heuristic approaches to solve e-scooter assignment problem,” IEEE Access, vol. 7, 2019.
[10] R. Nocerino, A. Colorni, F. Lia, and A. Lue, “E-bikes and e-scooters for smart logistics: environmental and economic sustainability in pro-e-bike Italian pilots,” Transportation Research Procedia, vol. 14, 2016.
[11] C. Hardt and K. Bogenberger, “Usage of e-scooters in urban environments,” Transportation Research Procedia, vol. 37, 2019.
[12] A. Y. Chang, L. Miranda-Moreno, R. Clewlow, and L. Sun, “Trend or fad?” https://www.sae.org/binaries/content/assets/cm/content/topics/micromobility/sae-micromobility-trend-or-fad-report.pdf. [Online; accessed 24-June-2020].
[13] G. McKenzie, “Spatiotemporal comparative analysis of scooter-share and bike-share usage patterns in Washington, DC,” Journal of Transport Geography, vol. 78, 2019.
[14] J. K. Mathew, M. Liu, and D. M. Bullock, “Impact of weather on shared electric scooter utilization,” in IEEE ITSC, 2019.
[15] M. Cocca, D. Giordano, M. Mellia, and L. Vassio, “Free floating electric car sharing: A data driven approach for system design,” IEEE Transactions on Intelligent Transportation Systems, vol. 20, 2019.
[16] M. Cocca, D. Giordano, M. Mellia, and L. Vassio, “Data driven optimization of charging station placement for EV free floating car sharing,” in IEEE ITSC, 2018.
[17] Y. Zheng, Z. Y. Dong, Y. Xu, K. Meng, J. H. Zhao, and J. Qiu, “Electric vehicle battery charging/swap stations in distribution systems: comparison study and optimal planning,” IEEE Transactions on Power Systems, vol. 29, 2013.
[18] M. R. Sarker, H. Pandžić, and M. A. Ortega-Vazquez, “Optimal operation and services scheduling for an electric vehicle battery swapping station,” IEEE Transactions on Power Systems, vol. 30, 2014.
[19] X. Zhang and G. Wang, “Optimal dispatch of electric vehicle batteries between battery swapping stations and charging stations,” in IEEE Power and Energy Society General Meeting, 2016.
[20] G. Battapothula, C. Yammani, and S. Maheswarapu, “Multi-objective optimal scheduling of electric vehicle batteries in battery swapping station,” in IEEE PES ISGT-Europe, 2016.
[21] E. S. Rigas, S. D. Ramchurn, and N. Bassiliades, “Algorithms for electric vehicle scheduling in mobility-on-demand schemes,” in IEEE ITSC. IEEE, 2015.
[22] P. You, S. H. Low, W. Tushar, G. Geng, C. Yuen, Z. Yang, and Y. Sun, “Scheduling of EV battery swapping, Part I: Centralized solution,” IEEE Transactions on Control of Network Systems, vol. 5, 2017.
[23] A. K. Menon and Y. Lee, “Predicting short-term public transport demand via inhomogeneous Poisson processes,” in ACM CIKM, 2017.
[24] M. Cocca, D. Teixeira, L. Vassio, M. Mellia, J. M. Almeida, and A. P. Couto da Silva, “On car-sharing usage prediction with open socio-demographic data,” Electronics, vol. 9, no. 1, p. 72, Jan 2020.
[25] M. Cocca, D. Giordano, M. Mellia, and L. Vassio, “Free floating electric car sharing design: Data driven optimisation,” Pervasive and Mobile Computing, vol. 55, pp. 59-75, 2019.


MDP-based Vehicular Network Connectivity Model for VCC Management

Abubakar Saad
Department of Computer Science
Brock University, Canada
asaad@brocku.ca

Robson E. De Grande
Department of Computer Science
Brock University, Canada
rdegrande@brocku.ca

Abstract: Vehicular Cloud computing is a new paradigm where vehicles collaboratively exchange data and resources to support services and problem-solving in urban environments. Characteristically, such Clouds undergo severely challenging conditions due to the high mobility of vehicles, and in essence, they are rather dynamic and complex. Many works have explored the assembling and management of Vehicular Clouds with designs that heavily focus on mobility. However, a mobility-based strategy relies on the geographical position of vehicles, and its feasibility has been questioned in some recent works. Therefore, we present a more relaxed Vehicular Cloud management scheme that relies on connectivity. This work models uncertainty and considers every possible chance a vehicle may be available through accessible communication means, such as V2X communications and the vehicle being in the range of RSUs for data transmissions. We utilize the MDP model to track the state of vehicles and when they are connected and available for transmission of the data.

Index Terms: VANETs, Connectivity, Mobility, VCC

This work is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC).
978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION
Vehicular Cloud computing (VCC) has been defined as a new paradigm where smart and connected vehicles are put together to form a mobile Cloud [1]. Recent technological advancements have drawn attention to creating VCCs. Serving as support, such advancements have allowed smart vehicles to contain high processing and storage capacity; these vehicles can also connect and interact among themselves and with the Internet through the use of vehicular networks. By creating a Cloud, these vehicles can collaboratively build a distributed system that extends their own individual capabilities.

The volume of vehicles in urban centers and their highly dynamic mobility position them as valuable resource providers for Smart Cities [1], [2]. Consequently, in the context of intelligent transportation systems, VCCs can potentially support urban computing to a wide extent, which makes them a truly attractive but rather complex resource, data, and service management approach to explore. Numerous works have proposed and designed practical services and applications based on Vehicular Clouds [2]. Thus, enhancing how such Clouds are assembled and managed is highly rewarding and significantly impacts the community.

Coping with the high mobility of vehicles is the great challenge in VCC. Works in Vehicular Cloud Computing and Vehicular Networks rely heavily on precisely modeling the movement and position of vehicles and devices to provide proper resource management. The current management methods for Vehicular Clouds depend on the mobility, proximity, and position of vehicles [2]. These methods require constant, and sometimes exhaustive, monitoring. This communication management scheme is very costly and prone to errors, impacting the assembling of Vehicular Clouds. Besides high monitoring and control costs, these approaches face situations where there is minimal chance of communication because vehicles stay in range very briefly due to their high speed. Consequently, the short amount of time vehicles are near each other makes the delivery of Cloud services almost impractical. Scientific works [1] have extensively questioned the feasibility of Vehicular Clouds in highly mobile environments, such as urban centers. These works have demonstrated that the contact time of vehicles is much shorter than what is needed to provide any services and resources to a requester.

Besides, several approaches have explored connectivity in vehicular networks. Delay Tolerant Networks [3], for instance, have coped with low-density vehicle scenarios to guarantee some level of packet delivery. Many technologies also make use of vertical and horizontal handoffs [4], [5] to achieve better networking conditions. These methods explore connectivity opportunities, showing prospects that help vehicles remain reachable.

Therefore, due to the drawbacks and challenges of existing mobility-based approaches, we propose an uncertainty-oriented connectivity model. The proposed model aims at discovering and mapping resources and content independently from the communication endpoints, acting similarly to Content-Centric Networks [6] and making use of mobility as a secondary parameter that just enables predictions. The approach provides a more relaxed and flexible method of resource discovery and indexing, increasing the opportunities for forming Clouds and finding content in a rather dynamic distributed environment.

The remainder of the paper is organized as follows. Section II provides an overview of works on the configuration and management of VCCs. Section III describes the problem tackled in this work. Section IV presents our approach to enhance Vehicular Cloud management based on connectivity uncertainty. Section V describes the experimental evaluations and discusses the obtained


results. Finally, Section VI presents the conclusion and future work directions.

II. RELATED WORKS
In Intelligent Transportation, approaches have tackled resource management to a great extent by exploring challenges originating from the high mobility of vehicles [1]. In the majority of works, the core aspects consist of dealing with unpredictability and highly dynamic communication topology changes. As an exception, there are approaches that do not emphasize mobility, namely the ones based on static parked vehicles [7]. However, the most common scenarios are dynamic, and works have already shown the non-feasibility of assembling Vehicular Clouds and cooperating while vehicles are moving on the traffic network [1]; the fast-changing scenario makes it unviable for vehicles to sustain plausible connections long enough to fulfill minimum sharing requirements. Even enabling platoon-oriented Vehicular Micro-Clouds that compose a distributed larger-scale Cloud cannot guarantee long-standing proximity and contact. Besides, these approaches rely on overwhelming control to assess the movement of nodes continually.

The design of methods to properly form and sustain Vehicular Clouds involves several areas, including the underlying networking protocols. Several works have already explored novel Content-centric Networks (CCN) to cope with dynamic vehicular environments [8], making use of an information-oriented paradigm to target the discovery of data more precisely. However, the support of CCN requires frequent updates and mobility prediction models to handle communication and connection inconsistencies. Predicting the position of vehicles, an extensively explored approach in vehicular environments, supported the definition of stability of the communication link between vehicles [9]. Equivalently, the link stability metric has been defined in other works [10] according to a quantification of wireless link stability based on the movement of vehicles in a rectilinear form. The relative distance, or speed, among vehicles characterizes stability. This particular work extends its definition by adding multi-hop estimation of stability, tackling V2V communication scenarios, which is employed in a fuzzy-based selection system to match QoS service requests. In summary, mobility models support estimates that benefit communication in VANETs, even enabling approaches focused on the content instead of vehicles to better balance information supply and demand [11].

Undoubtedly, Vehicular Networks are prone to connection losses. Formal modeling and analysis of delay allow the handling of intermittent connectivity in Vehicular Networks in sparse scenarios [12]. Low RSU coverage results in connection losses, which are also influenced by the speed and density of vehicles in road segments. The cost of implementing an infrastructure to support V2I communication has motivated works to search for optimal placement of RSUs on a traffic network so that delay-sensitive applications are not compromised [13]. Effective handling of connection loss facilitates the implementation of many Cloud applications. With the proliferation of computing-capable devices and vehicles, mobile Crowdsourcing has been explored to enable pervasive Cloud services through outsourcing tasks [14]. The massive number of devices allows for offloading and environment sensing.

There is a firm assumption that Vehicular Clouds will inevitably benefit cooperation and sharing through significantly adaptable Cloud management. Recent works have promoted such Clouds by tackling diverse aspects, such as mobility, heterogeneity, and scheduling. Cluster-based Vehicular Cloud creation and coordination in highly dynamic urban scenarios aim at following platooning-inspired strategies [15], [16]. Based on this approach, MDP-based modeling of vehicle mobility attempted to define enhanced resource allocation in Vehicular Clouds [17]. In the same perspective, a mobility model based on Artificial Neural Networks made it possible to reduce the impact of sudden movement changes for more efficient resource allocation in Vehicular Clouds [18]. Cluster-based approaches support multi-edge computing [19], where vehicles take the role of service providers. Thus, vehicular gateways relay data through the network and play a fundamental role in spreading data; this approach then relies heavily on mobility to select gateways accurately, directly impacting the efficiency in accessing and disseminating data.

Formal models based on mobility traces undertake a more precise selection of Vehicular Cloud hosts to receive offloaded tasks, given their required completion times [20]. The diversity in urban environments has motivated SMDP-based modeling to represent the heterogeneous spectrum of computing capabilities of vehicles and RSUs, to better assess availability and offloading strategies [21]. Eventually, Cloud management culminates in resource allocation, so efficient task scheduling is capable of leveraging Fog Computing and providing additional resources to build up distributed Data Centers [22]. Similar to Mobile Cloud computing, heuristic-based placement and scheduling algorithms have been defined to offload tasks from Vehicular Clouds to the Cloud to relieve the burden of running numerous applications and sensory services in vehicles [23]. This work advocates the prospect that Vehicular Clouds will be vital for backing up IoT.

Several works have demonstrated the importance of mapping and modeling link stability and network connectivity for Vehicular Networks. Such works have attempted to enhance connectivity through heavily mobility-oriented approaches, where the position, speed, and acceleration of vehicles and the traffic network topology are substantial elements in their models. Complementarily, we have observed steep advancements in communication technologies; for instance, cognitive radio has become a key enabling technology of dynamic spectrum access to achieve better exploitation of radio spectrum [24], [25]. As a result, even though mobility is an important factor influencing connectivity, urban environments are more comprehensive and contain many more opportunities for devices and vehicles to connect over multiple media, following V2V, V2I, and V2X communication methods. Given these existing and growing alternative and redundant network connections, we propose an approach that more closely matches the reality of urban centers and can take better advantage

215
2020 IEEE/ACM 24 ͭ ͪ International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

of cementing technologies to build and manage VCCs.


III. PROBLEM STATEMENT

Vehicular Clouds enable resource sharing through a Cloud service model in which nodes (vehicles) are interconnected and/or connected to the Internet. Vehicles, being part of a VC, can request services from other vehicles, Cloudlets, and the remote Cloud servers. On the other hand, these same vehicles can provide services to other vehicles and devices in a Mobile Edge Computing paradigm, shortening the communication distance in order to decrease service times. The largest problem in managing Vehicular Clouds is the high mobility of vehicles, which severely compromises communication stability.

Defining the Vehicular Cloud Computing scenario in the context of urban computing allows us to precisely identify and reason about the problem that is observed and tackled in this work. Thus, we understand that vehicles move around according to a particular mobility model inherent to the characteristics of an urban centre and following the topology of urban road segments. In this scenario, connected vehicles exchange data, services, and resources through wireless communication links in an opportunistic fashion, where connections may be abruptly interrupted, either permanently by moving out of range or momentarily by conducting handoffs over multiple access points.

In this same scenario, we consider the existence of multiple possible communication means, such as Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), and Vehicle-to-Everything (V2X). As a result, this diversity and multitude of communication opportunities directly provides a high chance for vehicles to be connected to the Internet and to each other. This assumption is justified by the several technological advancements that have made feasible a realistic context where many communication media in urban centres contribute to keeping vehicles connected. The connectivity chance is then much higher when compared with the scenarios in past works where V2V communication was assumed to be the predominant method for vehicles to communicate.

Therefore, a possible Vehicular Cloud scenario is depicted in Figure 1. The figure illustrates a single vehicle connecting through V2V communication and through LTE in a dynamic context where vehicles are moving and supporting the connection of the vehicle through multihop message passing protocols. Vehicular Clouds can be built through one or more sets of vehicles $V = \{v_1, \ldots, v_m\} \mid m \le n$, where $m$ is the maximum number of vehicles that might compose a Vehicular Cloud and $n$ is the number of vehicles existing in the area. Such vehicles move around an urban area, following a certain traffic network topology, where road segments are edges and intersections are vertices. RSUs, $R = \{r_1, \ldots, r_p\}$, are deployed over the urban area; these RSUs, besides their role in the traffic management system, provide a certain level of coverage for V2I access. As realistically expected, the area can contain some cellular communication towers, $L = \{l_1, \ldots, l_q\}$, which are deployed over the area. Vehicles may enter and leave the urban region. These vehicles can connect with each other, defining a V2V link $c_{v_i v_j}$. We can also have vehicles establishing a link with RSUs, $c_{v_i r_h}$, and with cellular towers, $c_{v_i l_k}$.

Fig. 1. Vehicular Clouds in an urban scenario.

The challenge in this relaxed case consists of precisely determining, or estimating, which vehicles are more suitable for fulfilling service requests based on their “connectivity status”. A connectivity model is sought in this work in order to abstract the intricate, inherent mobility idiosyncrasies. Availability is then the essential factor in classifying and indexing vehicles that may be participating in the “Cloud”. Modeling the possible connections allows us to take a first step towards determining, with a certain degree of confidence, the reachability and suitability of service providers for requests.

IV. MDP-BASED CONNECTIVITY MODEL

The model introduced in this work is intended to represent, and estimate, the likelihood that a vehicle keeps or changes its connectivity status. Through this MDP-based model, we represent the connection status uncertainty that stems from mobility. Optimistically, it provides the best-case future scenario that vehicles may face. This optimal case can be used to define which of the available vehicles are more suitable for serving requests over time. It extends the traditional resource provisioning models where just capabilities and communication delays are usually considered for pair-matching a requester against a set of providers.

A. Model

We assume that the behavior of vehicles, in the sense of their connectivity patterns, is modeled either individually or collectively, following the uncertainty from their mobility patterns and the wireless signal propagation and attenuation of the environment. To model the uncertainty probabilistically, a Markovian Decision Process (MDP) is employed, which is defined following a 5-tuple definition $Conn_i = (X_i, A_i, P_{a_i}, R_{a_i}, \gamma_i)$. In a more “relaxed” fashion, the model maps the opportunities a vehicle may have to connect over a set of heterogeneous media, involving vertical and horizontal hand-offs, as well as interleaving connections.

Note that we assume in this proposed model that the underlying communication system supports the model with relatively recent status data; media and message exchanges allow the monitoring of communication conditions. Such monitoring metrics are presumed to be locally and constantly fed into the model to represent the most recent communication status of a node (vehicle). Vehicles may also resort to periodic beaconing for probing communication conditions.
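As an illustration of the 5-tuple $Conn_i = (X_i, A_i, P_{a_i}, R_{a_i}, \gamma_i)$, the sketch below holds the per-vehicle model in a small Python container. The class name `ConnectivityMDP`, its field names, and the nested-dictionary layout are hypothetical choices made for this example and are not taken from the paper; the `validate` method only applies generic MDP sanity checks (each transition row sums to one, discount within the unit interval).

```python
from dataclasses import dataclass
from typing import Dict, Hashable, List

State = Hashable
Action = Hashable


@dataclass
class ConnectivityMDP:
    """Hypothetical per-vehicle container for (X, A, P, R, gamma)."""
    states: List[State]                                          # X_i
    actions: List[Action]                                        # A_i
    transitions: Dict[Action, Dict[State, Dict[State, float]]]   # P[a][x][y]
    rewards: Dict[State, Dict[Action, float]]                    # R[x][a]
    gamma: float                                                 # discount

    def validate(self, tol: float = 1e-9) -> None:
        """Raise ValueError if the tuple is not a well-formed MDP."""
        if not 0.0 <= self.gamma <= 1.0:
            raise ValueError("gamma must lie in the unit interval")
        for a in self.actions:
            for x in self.states:
                # Missing entries are treated as zero probability.
                row = self.transitions[a][x]
                row_sum = sum(row.get(y, 0.0) for y in self.states)
                if abs(row_sum - 1.0) > tol:
                    raise ValueError(
                        f"P[{a!r}][{x!r}] sums to {row_sum}, expected 1"
                    )
```

A vehicle (or its management unit) could keep one such object per vehicle and call `validate()` after every update of the matrices, so that malformed probability rows are caught before any policy search is run.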


1) State Space: Each vehicle may be connected through different means, including multiple means simultaneously. For instance, a vehicle may be connected to the “Internet” through other vehicles or through an RSU. Each communication means represents a connectivity link that enhances the reachability and network throughput of a vehicle. Each of these connectivity opportunities can be mapped into states, including the overlapping connection situations. In other words, the connection of a vehicle can transit among states, which allows us to produce all possible connection combinations in the most optimistic scenario over a given number of communication means, $n_c$. The number of combined states can then be cumulatively inferred as $2^{n_c} - 1$ (that is, $\sum_{1 \le k \le n_c} \binom{n_c}{k}$); if we include the state where there is no connection at all, the model then has $2^{n_c}$ states. We can also assume that this set of connectivity states composes a fully connected directed graph, where there is always a transition leading from state $s_i$ to state $s_j$. In such a graph at its simplest, there are $(2^{n_c})^2$ edges.

For instance, in a real scenario, we envision that vehicles may be simultaneously connected with RSUs, other vehicles, and LTE towers ($n_c = 3$). Moreover, in the peculiar case of V2V connections, a vehicle observes a multihop connection of one or more hops. However, mapping all possible multihop combination scenarios in the model leads to exponential growth in this discrete State Space. Thus, for simplicity, we summarize the multihop V2V connection as a single state factor, in which the number of hops directly impacts connection stability and bandwidth; this impact can be either measured through end-to-end checks or estimated over time periods. As a result, this scenario gives us a set of 8 states with 64 possible distinct transitions.

2) Action Space: The actions relate to gaining or losing connectivity as vehicles move around an urban area. As a result, the actions are grouped in two sets, “connecting” to or “disconnecting” from certain media. For instance, in a scenario where the only possible communication media are another vehicle or an RSU, we have the following actions: connecting to vehicle, connecting to RSU, disconnecting from vehicle, and disconnecting from RSU. The uniqueness of the vehicular network scenario allows the transition from one state to another to be represented through a single action; for example, the action “disconnecting from vehicle” leads the transition from the state “connected to RSU and vehicle” to “connected to RSU”. Assuming $n_c$ available media, we will have an Action Space $A_i$ of $2 \cdot n_c$ possible actions.

The action space is reduced when considering the restrictive transition of communication conditions. Thus, given a state $x_i$ of vehicle $i$, there is a set $A_i(x) \subseteq A_i$ of available actions, each leading to a single next state in $X_i$.

3) Transition Probabilities: Employing a randomized model, for each action in $A_i(x)$, there is a transition probability on $x$. These probabilities are assumed to represent the chances that the connection status may change, being conditioned on the recent past. In the model, the transition probabilities may be assigned or adjusted through statistical analysis, such as time series, following recorded past transitions.

Since we assume that the transition probabilities from any state sum to one, $p(X|x, a) = 1$, we can initialize the model with equal transition probabilities when no probabilities have been defined yet, that is, as initial values when starting the model. This starting setup is described in Equation 1. Then, for instance, each individual vehicle can update these probabilities independently to express its current behavior accurately.

$$P_{x_i,x_j}(a) = \begin{bmatrix} 1/2^{n_c} & \cdots & 1/2^{n_c} \\ \vdots & \ddots & \vdots \\ 1/2^{n_c} & \cdots & 1/2^{n_c} \end{bmatrix} \qquad (1)$$

4) Rewards: In each model state, there is a reward that represents the connectivity quality of a vehicle. Thus, a one-step reward is defined as $r(x_i, a_i)$, denoting the reward of using action $a$ in state $x$. Quality can be a cumulative value originating from several factors, such as link stability and end-to-end bandwidth, where both might be projected over time. In the proposed approach, rewards directly indicate the best possible opportunities in terms of network reachability.

5) Policies: The Markov policy $\pi$ of this model is, by definition, stationary and randomized since $\pi_{n_c}$ does not depend on $n_{n_c}$, and the selection of the actions follows a probability distribution that generally produces a $p(\cdot|x_i, a_i)$. By definition, a policy corresponds to a sequence of transition probabilities $\pi_n(a_n|h_n)$ from an $n$-step history $H_n$ to $A$, in a given number of iterations $n \in \mathbb{N}$. Again, policy $\pi$ is characterized by a transition probability $\pi$ that relates $x \in X$ to $a \in A$, which we summarize as $\pi(A(x)|x) = 1$ for all $x \in X$.

From the model, we can obtain an optimal policy that corresponds to the “best” actions to maximize rewards in light of the conditioning transition probabilities. Therefore, this policy is employed as an indicator that optimistically estimates the communication conditions with vehicles. All vehicles show these optimistic estimates as a baseline that is used to classify them against service requirements.

6) Transition Discounts: Discounts are applied to weigh down future steps/iterations in relation to the current state $x_i$ of the connection status. To deal with the expected total discounted reward, the discount is defined as a fixed value in this model, $\gamma \in [0, 1[$. We express the expected total reward over the first $N$ steps, $N \in \mathbb{N}$, as in Equation 2.

$$v_N(x, \pi, \gamma) = E_x^{\pi}\left[\sum_{n=0}^{N} \gamma^n r(x_n, a_n)\right] \qquad (2)$$

In other words, we define the value function $v(X)$ as the sum of all predicted future rewards, implementing temporal discounting with a gamma parameter where rewards $k$ steps in the future are weighted by an exponential discount factor $\gamma^k$. The value function turns into a weighted sum, described in Equation 2.

B. Connectivity Estimation

Assuming an infinite horizon when utilizing the reward function described in Equation 2, we employ the Bellman Equation [26] to define a state value function in state $x \in X$ for a stationary policy $\pi$, as described in Equation 3.
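To make the model components of Section IV-A concrete, the following Python sketch enumerates the $2^{n_c}$ states and $2 \cdot n_c$ actions for a list of media, builds the uniform starting matrix of Equation 1, evaluates the discounted sum of Equation 2 for one realized reward sequence (not the expectation), and applies a textbook Bellman optimality backup until the largest update falls below a tolerance. All function names, the `P[a][x][y]` and `R[x][a]` layouts, and the tolerance `eps` are assumptions made for illustration; this is a generic value-iteration sketch under those assumptions, not the authors' implementation, and it assumes $\gamma < 1$ so that the loop terminates.

```python
from itertools import product


def enumerate_states(media):
    """Return all 2**nc on/off combinations of the media as frozensets."""
    return [frozenset(m for m, on in zip(media, bits) if on)
            for bits in product((False, True), repeat=len(media))]


def enumerate_actions(media):
    """Return the 2*nc actions: connect to / disconnect from each medium."""
    return [(verb, m) for m in media for verb in ("connect", "disconnect")]


def uniform_transitions(states):
    """Starting setup of Equation 1: every entry equals 1 / 2**nc."""
    p = 1.0 / len(states)
    return {x: {y: p for y in states} for x in states}


def discounted_return(rewards, gamma):
    """Sum of gamma**n * r_n over one finite reward sequence."""
    return sum((gamma ** n) * r for n, r in enumerate(rewards))


def value_iteration(states, actions, P, R, gamma, eps=1e-6):
    """Generic value iteration; P[a][x][y] and R[x][a] are assumed layouts."""
    def q(x, a, V):
        # One-step reward plus discounted expected value of the next state.
        return R[x][a] + gamma * sum(P[a][x][y] * V[y] for y in states)

    V = {x: 0.0 for x in states}
    while True:
        delta = 0.0
        for x in states:
            best = max(q(x, a, V) for a in actions)
            delta = max(delta, abs(best - V[x]))
            V[x] = best
        if delta < eps:  # stop once the largest update is below eps
            break
    policy = {x: max(actions, key=lambda a: q(x, a, V)) for x in states}
    return V, policy
```

For the three-media case discussed above, `enumerate_states(["V2V", "RSU", "LTE"])` yields the 8 states and `uniform_transitions` the 64 equal entries; the resulting values `V` could then serve as the per-vehicle score used for ranking.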


$$V^{\pi}(x) = r(x, \pi(x)) + \gamma \sum_{y} p(y|x, \pi(x)) V^{\pi}(y) \qquad (3)$$

Following the Principle of Optimality of Bellman [26], we establish an inductive greedy search process that takes as its basis an initial state and decision, identifying an optimal policy in the subsequent decisions/actions. Thus, the principle of optimality applied over the value function gives the optimal value function $V^{*} = \max_{\pi} V^{\pi}$, represented in Equation 4.

$$V^{*}(x) = \max_{a \in A}\left[r(x, a) + \gamma \sum_{y} p(y|x, a) V^{*}(y)\right] \qquad (4)$$

In Equation 5, the optimal policy from a given state $x$ follows a similar representation of the optimal value function.

$$\pi^{*}(x) = \arg\max_{a \in A}\left[r(x, a) + \gamma \sum_{y} p(y|x, a) V^{*}(y)\right] \qquad (5)$$

Equations 4 and 5 can then be translated to a value-iteration algorithm that iteratively searches for the best policy given transition and reward matrices. The value-iteration search is summarized in Algorithm 1. According to the algorithm, the search ends when an error $\epsilon$ is satisfied. Both the discount $\gamma$ and the error $\epsilon$ condition the convergence and the number of iterations $k$ for which the algorithm runs, $k = \frac{\log(r_{max}/\epsilon)}{\log(1/\gamma)}$. The processing to identify the current suitable pseudo-optimal MDP policy of our model is conducted for each vehicle.

Algorithm 1: Connectivity Status Estimation
  Data: X_i; P_{a_i}; R; A_i; γ; ε
  Result: π_i; v*
  V = 0; π_i = 0;
  do
      ∆ = 0;
      for x ∈ X do
          Av = 0;
          for a ∈ A do
              x_n = A(x);
              Av[a] = P_a[x][x_n] ∗ (R[x_n] + γ ∗ V[x_n]);
          av_best = max(Av);
          ∆ = max(∆, |av_best − V[x]|);
          V[x] = av_best;
          π_i[x] = argmax(Av);
  while ∆ > ε;

C. VC Management

The Vehicular Cloud Management then fundamentally employs the proposed model in order to differentiate vehicles based on the quality and stability of their point of attachment to the network. The management can follow many architectures where centralized, hierarchical, or fully distributed service orchestrators allocate resources dynamically. In our envisioned scenario, a set of VC Management Units serves as reference points at which nodes (mobile devices) pose service requests that are matched with surrounding (in terms of network) vehicles. The whole scheme works in 3 phases: monitoring, calculation, and advertisement. In a fully distributed fashion, vehicles monitor their own current connections and update their MDP transition and reward matrices individually. The optimal policy search algorithm can be executed on an event basis or periodically. In the implementation for this work, we employed an event-oriented triggering of the search whenever an update occurs on the matrices. The connectivity usually relates to a reference point, which is the point of attachment in this work. Thus, due to this assumption, this point of attachment, which can be an RSU, tower, or VC Management Unit, also becomes a reference for devices and other vehicles to contact service providers. Vehicles notify their respective management unit, which then classifies them. The unit finally matches incoming service requesters based on the ranking it updated.

V. PERFORMANCE ANALYSIS

We have conducted a simulation-based experimental analysis to evaluate the performance of the proposed connectivity-oriented model. The simulations aimed to represent realistic urban center scenarios, requiring simulation of the communication, as well as the mobility, of vehicles in a targeted intelligent transportation environment.

A. Scenario

The experimental scenario is completely built using Veins [27], with the support of OMNeT++ [28] and SUMO [29]. The simulator Veins contains general protocols and modeling capabilities of networking or wireless protocols for communication of the nodes. SUMO provides microscopic traffic mobility simulation for the nodes in OMNeT++. Veins couples these two latter simulators and keeps them consistent, providing the basis for modeling and simulating Vehicular Networks.

The whole ITS scenario stands on WAVE as the supporting V2V communication means. Vehicles can “connect” to the Internet by communicating through both V2V and V2I. Thus, RSUs are also present in the simulation scenario. Relying on the IEEE 802.11p standard, the V2V communication mode follows WSMP, in which OFDM guarantees different data rates. We thus assume proper, organized one-hop communication over multiple channels through the Control Channel (CCH).

1) Traffic Network Topology: For the sake of realism, we use a real-world urban scenario where vehicles present high-mobility displacement patterns. We use a slice of the Cologne metropolitan area as our urban center, which adopts a traffic simulation data set scenario for bringing more realistic mobility. The region represents a dense urban area of 1 × 1 km², as depicted in Figure 2. Most of the map follows a standard grid layout, but some segment clusters form non-conventional layouts. The urban map also shares parts with highways, allowing a wider range of speeds and mobility patterns.

2) Parameters: To delimit and constrain our experimental scenarios, we defined combinations of parameter settings. Table I summarizes all parameters used in our simulations. In the simulations, the speed of vehicles ranges between 5 and 15 m/s to resemble usual urban centres. Such speeds thus lead to high dynamicity in our observed scenarios, where vehicles might change their connectivity statuses quite frequently in a short span of time. Also, our scenarios adopted


different densities of vehicles, which range from 500 to 2000 vehicles simultaneously moving in the simulated area. The density directly impacts mobility and connectivity, where sparse scenarios might lead to higher disconnection rates and high-density scenarios might allow vehicles to be connected quite often by either V2I or V2V. Observing both boundaries of the density spectrum allows us to tell if the proposed model is capable of capturing the respective connectivity conditions.

Fig. 2. Cologne Metropolitan area used in the simulation analysis.

TABLE I
SIMULATION PARAMETER SETTINGS
Parameter              Value Range
Urban area             1000 × 1000 m²
Vehicle density        500 to 2000
Vehicle speed          5 to 15 m/s
RSU density            1
PHY model              IEEE 802.11p
Vehicle comm. range    400 m
RSU comm. range        400 m
Transmission power     30 mW
γ                      0.1, 0.5, 1

For these particular simulation scenarios, a single RSU is deployed for the whole urban region. Consequently, during the simulation, vehicles will necessarily be out of range of the RSU at times, resorting to V2V communication or even being disconnected. For the simulations, the RSU is placed at the bottom left quadrant of the region depicted in Figure 2.

Besides the communication ranges adopted in our simulations, the transmissions of vehicles and the RSU are susceptible to propagation, attenuation, and collision issues that might occur, according to the network simulator. Such issues are expected and needed so that the proposed model can properly represent dynamic conditions while estimating connectivity statuses.

Since the model is supposed to adapt to changes, the transition matrices are periodically adjusted according to the new communication conditions of each vehicle. In the current implementation, the RSU is the entity responsible for maintaining such matrices, where each vehicle periodically reports its current connectivity status to the RSU. The new status then drives the RSU to update transition probabilities so that they more precisely represent the connectivity behaviour of a vehicle. After some simulation time, the RSU has collected a significant amount of data from vehicles, allowing it to rank the surrounding vehicles using the proposed model.

B. Performance Metric

We adopted Ranking as the main performance metric in these analyses. We rank individual vehicles based on the model value, and we compare the most “successful” and the “worst” from our ranking. The most “successful” will have constant “connectivity” to the RSU, and the “worst” has no “connectivity” or little “connectivity” to the RSU. The model presents three individual states: “Disconnected”, “V2V”, and “V2I”. The higher values get assigned to the vehicle that is in the state “V2I”, which means the vehicle is connected to the RSU and can also maintain that connection.

The Ranking metric is used to determine which vehicle can stay in constant contact even if the number of hops increases. If the vehicle stays connected to the RSU, any amount of data can be transferred to the vehicle and can be retrieved as well. The model picks the best candidate for that, and this Ranking metric hence proves to be more useful. We can also compare the top candidates with the worst candidates. Selecting the most “successful” candidates to distribute the data will result in faster delivery times. The situation could also change: the lower candidates may become top candidates as vehicles get closer to the RSU, which is also why we used a ranking system as our performance metric.

The ratio of the most “successful” connections per number of vehicles is another metric observed. After being ranked, two sets of vehicles are contacted. The n vehicles with the highest ranks compose the top set, while the n vehicles with the lowest ranks form the bottom set. We observe whether the connections with the RSU persist or vehicles are reachable through V2V after they have reported their connectivity statuses through the sent messages. This metric helps establish an understanding of the connectivity ratio in a high-mobility environment.

C. Results

Our metric is based on the hops required to reach the RSU. We group the vehicles by the number of hops, as shown in the graphs. We find the maximum, minimum, and average values assigned to each vehicle by the model from each group and draw a bar graph. The y-axis represents the hops, or “Reachability”, and the x-axis represents the maximum, minimum, and average values of the vehicle assigned by the model, or in our case “Ranking”. We rank the vehicles based on the value that is assigned by the model. Each group has the “best”, “worst”, and “average” vehicle. In Figure 3c, simulations were conducted with a discount factor of 1 and about 500 vehicles. The closer the vehicle is to the RSU, the higher the value is going to be. Group 1 is expected to present high values as those vehicles are only about 1 hop away from the RSU. The figure clearly shows that vehicles 2 hops away, named group 2, show considerably lower connectivity status when compared with vehicles 1 hop away. Vehicles in group 2 are just slightly better than group 3, which includes all the vehicles that are 3 hops away.

Figure 4b has a discount factor of 0.5, and the density of the vehicles is 2000. The bar graph here has much higher values


because of the proximity of vehicles to the RSU and among themselves. The higher density in this figure thus enables better connectivity overall when compared to Figure 3c. Since the density of the vehicles is really high, it results in much higher values by the model, as some vehicles could be really close to the RSU, and that makes the model assign higher values. Also, the vehicles that are constantly in a “connectivity” state to the RSU also receive a higher value. This behaviour is expected and shown in Figures 4b and 3c.

Fig. 3. Ranking of vehicle connectivity on a 500-vehicle scenario: (a) γ = 0.1, (b) γ = 0.5, (c) γ = 1. [Bar charts of Max, Min, and Avg; x-axis: Ranking; y-axis: Reachability (# of hops), 1 to 3.]

Fig. 4. Ranking of vehicle connectivity on a 2000-vehicle scenario: (a) γ = 0.1, (b) γ = 0.5, (c) γ = 1. [Bar charts of Max, Min, and Avg; x-axis: Ranking; y-axis: Reachability (# of hops), 1 to 3.]

Fig. 5. Ratio of successful connections: (a) γ = 0.1, (b) γ = 0.5, (c) γ = 1. [Bar charts of Top and Bottom sets; x-axis: # of Vehicles (500, 1000, 2000); y-axis: Success Rate, 0.0 to 0.5.]

We also compare the density and how it affects the ranking metric. Each of the figures corresponds to a discount factor and a density of vehicles. As the density increases, so do the size and the maximum values of group 1. There is a possibility that a vehicle was initially 2 hops away and, as it gets closer to the RSU, changes its network distance to only 1 hop away. Consequently, the values of this vehicle increase. Vehicles are always moving around; as they get either closer to or farther away from the RSU, the model assigns the value on which the ranking is based. Thus, vehicles might be selected as the “successful” candidate.

These results show us the differences and similarities when comparing the discount factor and density. Mostly, the discount factor does affect the density, which shows a lot more relevance and impact on the model. Figures 3 and 5 show the discount factor effects in each density. As the discount factor approaches 1, it also increases the vehicle’s values assigned by the model. Because the connection becomes more stable, the model assigns higher priority to vehicles that are only one hop away and with a discount factor of 1. Thus, the model generates an even higher priority for those vehicles. Figures 3 and 5 clearly show the impact of the density. As the number of vehicles increases, so does the connectivity value


as the number of vehicles could vary, possibly close to the RSU. As more vehicles are present in the scenario, their state could move from V2V to V2I, resulting in a higher ranking.

Lastly, the Success Rate of the connections is summarized in Figures 5a, 5b, and 5c. It is expected that the top set shows a higher success ratio than the bottom set. However, the bottom vehicles showed higher ratios. The reason behind a much lower success rate of the top group is our scenario’s high mobility. By the time contact messages are sent, vehicles have already moved completely out of reach. Vehicles do not linger around the RSU, and they tend to leave; they either leave the simulated area (map), the range of the RSU, or the range of other vehicles. On the other hand, vehicles that were ranked low but moving towards the RSU were within its range when contacted.

VI. CONCLUSION

In this paper, we have proposed a new MDP-based model for estimating and representing the connectivity level in vehicular networks. The connectivity model optimistically identifies the “best” vehicle candidate in a scenario where the access to vehicles permeates the delivery of services and resources in a Vehicular Cloud fashion. We have evaluated the proposed model through simulations, which showed that the ranking properly represents the connectivity conditions of vehicles.

As future work, we will study connection intermittency in VANET scenarios where the quality of links is measured throughout the delivery of services. Frequency and availability time will be incorporated into the devised MDP-based uncertainty model so that they can more accurately characterize the connectivity patterns vehicles may run into while they move around urban centres or are in stationary scenarios. This extended model is intended to introduce a general representation that may be specified and self-adapt according to the recent behavior of VC nodes.

REFERENCES

[9] A. Boukerche, R. W. L. Coutinho, and X. Yu, “Lisic: A link stability-based protocol for vehicular information-centric networks,” in Proc. of the IEEE Int. Conf. on Mobile Ad Hoc and Sensor Systems, 2017, pp. 233–240.
[10] N. Tamani, B. Brik, N. Lagraa, and Y. Ghamri-Doudane, “On link stability metric and fuzzy quantification for service selection in mobile vehicular cloud,” IEEE Transactions on ITS, pp. 1–13, 2019.
[11] C. Xu, W. Quan, A. V. Vasilakos, H. Zhang, and G.-M. Muntean, “Information-centric cost-efficient optimization for multimedia content delivery in mobile vehicular networks,” Elsevier Computer Communications, vol. 99, pp. 93–106, 2017.
[12] Y. Wang, J. Zheng, and N. Mitton, “Delivery delay analysis for roadside unit deployment in vehicular ad hoc networks with intermittent connectivity,” IEEE Transactions on Vehicular Technology, vol. 65, no. 10, pp. 8591–8602, 2016.
[13] S. Mehar, S. M. Senouci, A. Kies, and M. M. Zoulikha, “An optimized roadside units (rsu) placement for delay-sensitive applications in vehicular networks,” in Proc. of the Annual IEEE Consumer Communications and Networking Conference, 2015, pp. 121–127.
[14] J. Ren, Y. Zhang, K. Zhang, and X. Shen, “Exploiting mobile crowdsourcing for pervasive cloud services: challenges and solutions,” IEEE Comm. Magazine, vol. 53, no. 3, pp. 98–105, 2015.
[15] R. I. Meneguette, A. Boukerche, and R. E. De Grande, “SMART: an efficient resource search and management scheme for vehicular cloud-connected system,” in Proc. of IEEE GLOBECOM, 2016, pp. 1–6.
[16] R. I. Meneguette and A. Boukerche, “Servites: An efficient search and allocation resource protocol based on V2V communication for vehicular cloud,” Elsevier Computer Networks, vol. 123, pp. 104–118, 2017.
[17] R. I. Meneguette, A. Boukerche, A. H. M. Pimenta, and M. Meneguette, “A resource allocation scheme based on semi-markov decision process for dynamic vehicular clouds,” in Proc. of the IEEE Int. Conf. on Communications, 2017, pp. 1–6.
[18] A. M. Mustafa, O. M. Abubakr, O. Ahmadien, A. Ahmedin, and B. Mokhtar, “Mobility prediction for efficient resources management in vehicular cloud computing,” in Proc. of the IEEE Int. Conf. on Mobile Cloud Computing, Services, and Engineering, 2017, pp. 53–59.
[19] I. Jabri, T. Mekki, A. Rachedi, and M. B. Jemaa, “Vehicular fog gateways selection on the internet of vehicles: A fuzzy logic with ant colony optimization based approach,” Ad Hoc Networks, p. 101879, 2019.
[20] F. Zhang, R. E. De Grande, and A. Boukerche, “Macroscopic interval-split free-flow model for vehicular cloud computing,” in Proc. of the IEEE/ACM Int. Symp. on Distributed Simulation and Real Time Applications, 2017, pp. 1–8.
[21] C. Lin, D. Deng, and C. Yao, “Resource allocation in vehicular cloud computing systems with heterogeneous vehicles and roadside units,” IEEE Internet of Things Journal, vol. 5, no. 5, pp. 3692–3700, 2018.
[1] R. Florin and S. Olariu, “Toward approximating job completion time [22] M. Sookhak, F. R. Yu, Y. He, H. Talebian, N. S. Safa, N. Zhao,
in vehicular clouds,” IEEE Transactions on ITS, vol. PP, pp. 1–10, 11 M. K. Khan, and N. Kumar, “Fog vehicular computing: Augmentation
2018. of fog computing using vehicular cloud computing,” IEEE Vehicular
[2] A. Boukerche and R. E. De Grande, “Vehicular cloud computing: Technology Magazine, vol. 12, no. 3, pp. 55–64, 2017.
Architectures, applications, and mobility,” Elsevier Computer Networks, [23] A. Ashok, P. Steenkiste, and F. Bai, “Vehicular cloud computing through
vol. 135, pp. 171 – 189, 2018. dynamic computation offloading,” Elsevier Computer Communications,
[3] J. A. F. F. Dias, J. J. P. C. Rodrigues, F. Xia, and C. X. Mavromous- vol. 120, pp. 125 – 137, 2018.
takis, “A cooperative watchdog system to detect misbehavior nodes [24] K. K. Ghanshala, S. Sharma, S. Mohan, L. Nautiyal, P. Mishra, and R. C.
in vehicular delay-tolerant networks,” IEEE Transactions on Industrial Joshi, “Self-organizing sustainable spectrum management methodology
Electronics, vol. 62, no. 12, pp. 7929–7937, 2015. in cognitive radio vehicular adhoc network (cravenet) environment: A
[4] Y. Bi, H. Zhou, W. Xu, X. S. Shen, and H. Zhao, “An efficient pmipv6- reinforcement learning approach,” in 2018 First Int. Conf. on Secure
based handoff scheme for urban vehicular networks,” IEEE Transactions Cyber Computing and Communication, 2018, pp. 168–172.
on ITS, vol. 17, no. 12, pp. 3613–3628, 2016. [25] Y. He, F. R. Yu, Z. Wei, and V. Leung, “Trust management for secure
[5] P. Dhingra and P. C. Jain, “Cost-effective vertical handoff strategies in cognitive radio vehicular ad hoc networks,” Ad Hoc Networks, vol. 86,
heterogeneous vehicular networks,” in Proc. of the Springer Int. Conf. pp. 154 – 165, 2019.
on Advanced Computational and Communication Paradigms, 2018, pp. [26] C. Sammut and G. I. Webb, Eds., Bellman Equation, Boston, MA, 2010,
369–377. pp. 97–97.
[6] A. K. Niari, R. Berangi, and M. Fathy, “Eccn: an extended ccn architec- [27] C. Sommer, R. German, and F. Dressler, “Bidirectionally Coupled
ture to improve data access in vehicular content-centric network,” The Network and Road Traffic Simulation for Improved IVC Analysis,” IEEE
Journal of Supercomputing, vol. 74, no. 1, pp. 205–221, 2018. Transactions on Mobile Computing, vol. 10, no. 1, pp. 3–15, 1 2011.
[7] F. H. Rahman, A. Y. M. Iqbal, S. H. S. Newaz, A. T. Wan, and M. S. [28] A. Varga and R. Hornig, “An overview of the omnet++ simulation
Ahsan, “Street parked vehicles based vehicular fog computing: Tcp environment,” in Proc. of the Int. Conf. on Simulation Tools and
throughput evaluation and future research direction,” in 2019 21st Int. Techniques for Communications, Networks and Systems & Workshops,
Conf. on Advanced Communication Technology, 2019, pp. 26–31. 2008, pp. 60:1–60:10.
[8] R. W. L. Coutinho, A. Boukerche, and X. Yu, “Information-centric [29] P. A. Lopez, M. Behrisch, L. Bieker-Walz, J. Erdmann, Y.-P. Flötteröd,
strategies for content delivery in intelligent vehicular networks,” in Proc. R. Hilbrich, L. Lücken, J. Rummel, P. Wagner, and E. Wießner,
of the ACM Symp. on Design and Analysis of Intelligent Vehicular “Microscopic traffic simulation using sumo,” in Proc. of the 21st IEEE
Networks and Applications, 2018, pp. 21–26. Int. Conf. on Intelligent Transportation Systems, 2018.

2020 IEEE/ACM 24th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)

A New Mobility Samples Encoding Scheme Based on Pairing Functions and Data Analytics
Peppino Fazio† , Miralem Mehic∗ † , Pavol Partila† , Jaromir Tovarek† , Miroslav Voznak†
∗Department of Telecommunications, Faculty of Electrical Engineering, University of Sarajevo,
Zmaja od Bosne bb, 71000, Sarajevo, Bosnia and Herzegovina
† VSB – Technical University of Ostrava, 17. listopadu 2172/15, 708 33 Ostrava, Czechia
† peppino.fazio@vsb.cz, † miralem.mehic@ieee.org, † pavol.partila@vsb.cz, † jaromir.tovarek@vsb.cz, † miroslav.voznak@vsb.cz

Abstract—In modern telecommunication systems, mobility is one of the key advantages of wireless communications, given that data can be transmitted and received without requiring a static position in the network. Of course, mobility poses special issues such as degradation, channel quality fluctuations, fast topology changes, and so on. Current research focuses on predicting the future positions of mobile nodes, in order to know a priori, for example, how the network topology will evolve or which level of stability each node will reach. Each prediction scheme is based on the storage and analysis of several historical mobility trajectories, which are used to train the prediction algorithm. In this paper, we focus on optimizing the space needed to store historical mobility samples, encoding their values and evaluating the conversion error for different encoding functions. Several simulation campaigns have been carried out in order to evaluate the goodness and feasibility of our proposal.
Index Terms—Mobile Networking, Mobility, Prediction, Training, Pairing functions, Sampling.
978-1-7281-7343-6/20/$31.00 ©2020 IEEE

I. INTRODUCTION
In recent years, data analysis for prediction purposes has been one of the main research activities carried out in a very wide variety of scientific communities [1–5, 12, 13]: the general term used to indicate this kind of activity is data analytics. To this aim, it is very important to collect data from real-world processes (node mobility, financial tendencies, network performance, urban planning, epidemic control, location-based services, intelligent transportation management, etc.), to analyse them from a statistical/stochastic point of view and to implement a predictive algorithm able to forecast future values of the observed process. One of the main issues in this kind of approach is the creation of historical log-files and their storage in digital formats. In this paper we are interested in studying and analysing the behavior of mobile nodes and, in particular, in proposing a new way of storing their sampled values, in order to encode them and save storage space. In fact, mobility is generally studied by considering the processes related to the single coordinates (expressed in Cartesian terms or GPS values) separately [15, 16]: this implies an uncorrelated study of the trajectories and a waste of storage space, due to the need to create historical trace-files (or log-files).
We base our approach on Pairing Functions (PFs) [6–10], applying them to the set of collected coordinates. In fact, if we assume that the location of a moving node is represented by its spatial coordinates (such as a couple of values in a 2D space, or a triplet in a 3D space), then a new way to represent them with a single value can be proposed: each prediction can be made by considering only one evaluation of the next position, without taking into consideration the variations of the single coordinates separately.
We evaluate different pairing functions and, for all of them, we consider the magnitude of the encoded samples (codomain) and the error committed when decoding back (unpairing) the original values. PFs have been widely used in information security systems [11, 14]; in this work, instead, we apply them to a completely different research topic. Without loss of generality, we carry out our analysis on mobility records composed only of GPS coordinates (or some representation of them), without considering any additional feature (such as social relations, points of interest, location-based social networks, etc.). In this way, the proposed approach is completely general and can be enhanced, if needed, by considering the relevant features of the scenario at hand.
The paper is organized as follows: Section II introduces some recent works on mobility and trajectory analysis, while Section III gives a deeper description of the proposed idea from a theoretical point of view. Section IV gives a detailed description of the main results, and Section V concludes the paper.

II. STATE OF THE ART
Mobility analysis has been an extensive research activity in the last decades. The a-priori knowledge of the future positions of mobile hosts attracts the attention of many researchers, who are interested in planning their systems on the basis of the future trend of the considered process. The procedure is always the same: a) historical data collection; b) data analysis; c) implementation of a predictive model; d) training and prediction. For example, in [18] a novel prediction scheme is proposed, based on the management of smartphone data (location, schedule, e-mail information, etc.); the authors, after the illustration of the importance of big data management, show the way the data over more than


one year has been collected and, based on it, they demonstrate that the proposed scheme can predict user location precisely, giving mobile users some enhanced services (about location, torrential rain, train delays, traffic jams, etc.). The work in [19] is strictly related to location prediction, Points of Social Interactions (PSIs) and Points of Interest (POIs). The authors compared the last two aspects, together with a two-step PSI model and a two-stage POI clustering approach, to reduce the effects of randomness and to improve the overall performance of the prediction scheme. The paper illustrates several results, which show how the PSI approach outperforms other predictive algorithms. In [20] a recent overview is given of different methods and approaches for predicting mobile trajectories, basing the choice of next places on mobility data. The paper, after an interesting introduction, describes the basic concepts of location prediction, including the different sources of trajectory data, the general prediction framework, challenges in location prediction, and common trajectory data preprocessing methods. The authors of [22] underline the importance of analysing human mobility, as well as implementing a predictive approach. The work underlines, at the same time, the heterogeneity of mobile nodes, because nowadays they consist of handheld terminals, GPS, vehicular nodes, sensors, social-media-based nodes, etc. The authors survey several approaches for characterizing human mobility patterns at the individual, collective, and hybrid levels. In [21], the authors face the issue of sparse individual trajectory data, which often results in a high prediction error. The proposed scheme is called Individual Trajectory-Group Trajectory (ITGT), and it is based on the pattern created by group travels. Different stages are considered, starting from a stay point extraction with spatial clustering, and different Markov models (PPM and PST) are then exploited to predict the clustering link. A massive amount of real data points has been used, and the obtained results confirmed the authors' expectations, with an accuracy of almost 90%.
To the best of our knowledge, and from the reading of the most recent papers on mobility prediction (such as the ones described before), no works focus on pairing and unpairing mobility coordinates in order to simplify the implementation of a predictive approach. So, our main contribution consists in proposing a novel approach able to simplify the representation of mobility samples and to reduce the time complexity of the predictive algorithms. In the next section the main idea is illustrated.

III. PAIRING FUNCTIONS, MOBILITY SAMPLES PAIRING AND UNPAIRING
In this section, the proposed idea is described. First of all the concept of pairing function is described, then it is applied to mobility in dynamic networks. We focused our attention on 2D mobility samples (but the proposal can be easily extended to a 3D environment), proposing a way to encode a sequence of m samples (each sample is represented by a couple of coordinates x and y) into only one value: in this way, only one trace needs to be analyzed and only one prediction needs to be made; once the next value is obtained, it can be decoded back into the m original components. To do this, we based our approach on Pairing Functions (PFs) [6, 7]. The concept of PF is now briefly explained and, then, some PFs are introduced for encoding the content of the trace files.
A Pairing Function (PF) is defined on a particular domain $Dom$ and it encodes each couple (pair) of elements from $Dom$ to a single element of $Dom$: any two distinct couples will be represented by two distinct elements (in this way it is possible to decode back the original values when needed). A PF is generally indicated as a function $pf : Dom^m \to Dom$, and PFs are used in a wide variety of applications (renderers, shaders, theoretical computer science, etc.). We will indicate with $pf^{-1} : Dom \to Dom^m$ the inverse function that decodes back the m values (it is also called unpairing function). Many PFs have been defined in the literature [8]: their study and evaluation are out of the scope of this paper, while the main aim of this sub-section consists in the application of some PFs to encode mobility traces. Cantor's PF is the best known [8], defined as a bijection $\mathbb{N}^2 \to \mathbb{N}$:
$$pf_{Cantor}(x, y) = \frac{(x + y) \cdot (x + y + 1)}{2} + y \tag{1}$$
but it has been demonstrated that it has some limitations in terms of value packing efficiency. For example, if we set x = 9 and y = 7 we would expect to obtain a maximum of 80 as a result (given that two values in the ranges 0-9 and 0-7 can create only 80 combinations), but $pf_{Cantor}(9, 7) = 143$, with an efficiency of only 56%. This result can be drastically improved by Szudzik's PF, also defined as a bijection $\mathbb{N}^2 \to \mathbb{N}$ (it is also known as the Elegant Pairing Function):
$$pf_{Szudzik}(x, y) = \begin{cases} x + y^2 & x < y \\ x^2 + x + y & x \ge y \end{cases} \tag{2}$$
with $pf_{Szudzik}(9, 7) = 97$ (efficiency is increased to about 82%).
For the PFs in eq. 1 and eq. 2 the unpairing functions are defined as follows:
$$pf^{-1}_{Cantor}(a) = \begin{cases} x = \frac{i \cdot (3 + i)}{2} - a \\ y = a - \frac{i \cdot (i + 1)}{2} \end{cases} \tag{3}$$
where
$$i = \left\lfloor \frac{-1 + \sqrt{1 + 8a}}{2} \right\rfloor \tag{4}$$
and
$$pf^{-1}_{Szudzik}(a) = \begin{cases} x = a - \lfloor \sqrt{a} \rfloor^2 \\ y = \lfloor \sqrt{a} \rfloor \end{cases} \tag{5}$$
if x < y (equivalently, if $a - \lfloor \sqrt{a} \rfloor^2 < \lfloor \sqrt{a} \rfloor$), or
$$pf^{-1}_{Szudzik}(a) = \begin{cases} x = \lfloor \sqrt{a} \rfloor \\ y = a - \lfloor \sqrt{a} \rfloor^2 - \lfloor \sqrt{a} \rfloor \end{cases} \tag{6}$$
otherwise.
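The pairing and unpairing definitions in eqs. (1)-(6) can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB code used by the authors: the function names are hypothetical, inputs are assumed to be non-negative integers, and the Cantor unpairing assigns the components so that it inverts eq. (1) as written (with the additive term y).

```python
from math import isqrt


def cantor_pair(x: int, y: int) -> int:
    """Eq. (1): Cantor's pairing function on non-negative integers."""
    return (x + y) * (x + y + 1) // 2 + y


def cantor_unpair(a: int) -> tuple:
    """Eqs. (3)-(4): recover (x, y) from a Cantor-paired value."""
    # i = floor((-1 + sqrt(1 + 8a)) / 2), computed with exact integer arithmetic
    i = (isqrt(8 * a + 1) - 1) // 2
    y = a - i * (i + 1) // 2
    x = i * (i + 3) // 2 - a
    return x, y


def szudzik_pair(x: int, y: int) -> int:
    """Eq. (2): Szudzik's (elegant) pairing function."""
    return x + y * y if x < y else x * x + x + y


def szudzik_unpair(a: int) -> tuple:
    """Eqs. (5)-(6): recover (x, y) from a Szudzik-paired value."""
    root = isqrt(a)            # floor(sqrt(a))
    rest = a - root * root
    if rest < root:            # corresponds to the case x < y
        return rest, root
    return root, rest - root   # corresponds to the case x >= y


if __name__ == "__main__":
    # Worked example from the text: x = 9, y = 7
    print(cantor_pair(9, 7), cantor_unpair(cantor_pair(9, 7)))
    print(szudzik_pair(9, 7), szudzik_unpair(szudzik_pair(9, 7)))
```

Integer square roots (`math.isqrt`) are used instead of floating-point `sqrt` so that the floor operations in eqs. (4)-(6) stay exact for large paired values.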


Once the PFs are defined, we have to verify if and how they can be adapted to our purposes (mainly the encoding of mobility samples).
To this aim, let us consider a generic trace-file $tf$, containing mobility values belonging to $\mathbb{R}^2$ (the approach can be easily generalized if mobility coordinates belong to $\mathbb{R}^3$ or, in general, to $\mathbb{R}^m$). That is to say, we consider historical mobility values stored as couples (in the case of a 3D space they are stored as triplets). Let us define $x_n(t)$ as the value of the x coordinate of user n at time t. The same definition can be given for the y coordinate and for $y_n(t)$. They are continuous functions of time. We assume that the mobile network (or, directly, the mobile node n) is able to store $x_n(t)$ and $y_n(t)$ every T seconds (sampling period): we indicate with $X_n(kT)$ and $Y_n(kT)$ the discretized versions of $x_n(t)$ and $y_n(t)$, where k is a non-negative integer (for k = 0 the sampling operation is started). In this sense, the terms $X_n$ and $Y_n$ can be considered as random variables, both defined on the space $\Omega \equiv \mathbb{R}$. After the collection of mobility samples, the vectors $\vec{X}_n(T)$ and $\vec{Y}_n(T)$ are obtained, with $||\vec{X}_n(T)|| = ||\vec{Y}_n(T)|| = N$. Clearly, as said before, it is also possible to extend the analysis to the third variable z. Most of the existing works do not account for the intrinsic correlation between the spatial components of a 2D space (we consider dimensions up to $\mathbb{R}^2$). In this paper, instead, we propose to study the mobility coordinates by considering also their intrinsic relationship, encoding them in one value at each sampling period. In fact, a node moves by respecting the environmental constraints; so, analyzing each coordinate process independently from the other ones leads to the definition of models which may lose some precious information. Clearly, all the equations defined before are still valid for all the other mobility components. In the next section, some numerical results are presented, showing what can be achieved by applying PFs to mobility samples.

IV. SIMULATION RESULTS AND ANALYSIS
To test and verify the concepts illustrated before, a MATLAB testbed has been set up. Different functions have been defined for analyzing and characterizing the downloaded data in terms of mobility samples. In particular, we considered the datasets in [26], consisting of human mobility traces in GPS format from five different sites: two university campuses (NCSU and KAIST), New York City, Disney World (Orlando), and North Carolina Raleigh (during the state fair event). We referred to the previous traces, for which an observation window of 30s has been considered, that is to say, each sample collection activity has a global duration of 30s. For the KAIST traces, the average number of samples $N^*$ is 1608, for the NCSU traces $N^*$=1431, for the NY traces $N^*$=1600, for the Orlando traces $N^*$=1284 and for the North Carolina traces $N^*$=415. Given that PFs consider only $\mathbb{N}$ as domain and codomain, mobility samples need to be transformed (PFs are polynomial functions, and no continuous bijections are possible between $\mathbb{R}^2$ and $\mathbb{R}$ [10]). In particular:
• Mobility trace values can be quantized: given a geographical region in which nodes are moving, it is easy to derive the minimum and maximum extension of $x_n(t)$ and $y_n(t)$ and, after the sampling operation, values can be quantized, after setting a proper resolution and assigning integer values;
• Samples approximation: depending on the used mobility format (GPS, planar coordinates, etc.), the decimal part can be neglected, or any value can be transformed into an integer one.
Mobility traces often contain negative values, which do not belong to $\mathbb{N}$; they can be converted into integer ones, but no operations can be made on the sign. Also in this case, we have some solutions:
• x and y values can be translated to move the origin of the reference system;
• PFs can be transformed to account for negative integers.
In particular, negative numbers are taken into account by applying the following transformation before the evaluation of the chosen pf:
$$c = \begin{cases} -2x - 1 & x < 0 \\ 2x & x \ge 0 \end{cases} \tag{7}$$
$$d = \begin{cases} -2y - 1 & y < 0 \\ 2y & y \ge 0 \end{cases} \tag{8}$$
and evaluating $pf(c, d)$.
In this paper we consider Cantor's function because it is the best known in the literature, while Szudzik's function is one of the most efficient PFs. As said in the previous section, 3D mobility environments [17] can also be considered: given x, y and z coordinates, we can evaluate $x' = pf(x, y)$ and $y' = pf(x', z)$, so the stored value will be only $y'$ (in the literature, this is also defined recursively by writing $pf[x, pf(y, z)]$). In order to apply the concepts related to PFs as in the previous section, the values of the trace-files should belong to $\mathbb{N}$ (or to $\mathbb{Z}$ if negative values are taken into account). In general, the downloaded files contain real values, so we decided to transform them into integer values by finding a proper multiplying factor; in this way, only integer values have been considered, while negative numbers have been handled by equations 7, 8.
In the case of pedestrian traces [27], each row of the trace files is simply formatted as ID, time (24h format), latitude, longitude, and $T_{pedestrian} = 50$ms. Figures 1, 2 and 3 illustrate some examples of pedestrian patterns for the NCSU, Disney and KAIST scenarios respectively (a total of 100 samples for each trace): in the upper part, the trends of x and y as functions of the discrete time k are shown (T = 30s). The complete pattern in a 2D space can be observed at the bottom of the figures. If we apply Cantor's and Szudzik's pairing functions to the above illustrated mobility patterns we obtain the trends illustrated in figures 4, 5 and 6 respectively.
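The preprocessing steps described above (scaling real coordinates to integers, folding negative integers onto the naturals with eqs. (7)-(8), and nesting the PF for a third coordinate) could be combined as in the following Python sketch. The helper names, the multiplying factor `scale = 10**6` and the sample coordinates are assumptions made for illustration only; the paper does not state the factor that was actually used. Szudzik's PF is repeated here so that the block is self-contained.

```python
from math import isqrt


def szudzik_pair(x: int, y: int) -> int:
    """Eq. (2): Szudzik's pairing function on non-negative integers."""
    return x + y * y if x < y else x * x + x + y


def szudzik_unpair(a: int) -> tuple:
    """Eqs. (5)-(6): inverse of szudzik_pair."""
    root = isqrt(a)
    rest = a - root * root
    return (rest, root) if rest < root else (root, rest - root)


def fold_sign(v: int) -> int:
    """Eqs. (7)-(8): map a signed integer to a non-negative one."""
    return -2 * v - 1 if v < 0 else 2 * v


def unfold_sign(c: int) -> int:
    """Inverse of fold_sign: odd values came from negatives, even from the rest."""
    return -((c + 1) // 2) if c % 2 else c // 2


def encode_sample(lat: float, lon: float, scale: int = 10**6) -> int:
    """Scale to integers (assumed factor), fold the sign, then pair."""
    c = fold_sign(round(lat * scale))
    d = fold_sign(round(lon * scale))
    return szudzik_pair(c, d)


def decode_sample(a: int, scale: int = 10**6) -> tuple:
    """Unpair, unfold the sign and undo the scaling."""
    c, d = szudzik_unpair(a)
    return unfold_sign(c) / scale, unfold_sign(d) / scale


def encode_3d(x: int, y: int, z: int) -> int:
    """x' = pf(x, y), then y' = pf(x', z); only y' is stored."""
    return szudzik_pair(szudzik_pair(x, y), z)


def decode_3d(a: int) -> tuple:
    """Undo the nesting in reverse order."""
    x_prime, z = szudzik_unpair(a)
    x, y = szudzik_unpair(x_prime)
    return x, y, z


if __name__ == "__main__":
    code = encode_sample(35.7721, -78.6738)  # hypothetical GPS sample
    print(code, decode_sample(code))
    print(decode_3d(encode_3d(12, 5, 40)))
```

The decoded coordinates are exact only up to the chosen resolution (here six decimal digits), and the paired value grows roughly with the square of the scaled coordinates, which is the storage cost discussed in the text. Python integers are unbounded; a fixed-width implementation would have to check for overflow.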


Figure 1. The trend of x and y coordinates in the case of pedestrian traces (NCSU). The bottom plot represents the complete pattern in a 2D space.
Figure 2. The trend of x and y coordinates in the case of pedestrian traces (Disney). The bottom plot represents the complete pattern in a 2D space.
Figure 3. The trend of x and y coordinates in the case of pedestrian traces (KAIST). The bottom plot represents the complete pattern in a 2D space.
Figure 4. The trend of the paired values of Fig. 1 (NCSU).
Figure 5. The trend of the paired values of Fig. 2 (Disney).
Figure 6. The trend of the paired values of Fig. 3 (KAIST).

First of all, we can observe that for each couple of mobility coordinates we obtain only one function (solid black for $pf_{Cantor}$ and dotted red for $pf_{Szudzik}$). Given that pairing functions, in general, contain power operations, the order of magnitude of the new curves is completely different (from $10^3$ for the original mobility to $10^5$-$10^6$ for the paired version). PFs are useful for storing mobility samples and they preserve the correlation between the x and y processes (in general also for the z coordinate), but their main strength is the reduced time complexity of analyzing only one process instead of two or three. On the other hand, the price is paid in terms of the space needed to store the paired numbers (more bits are necessary). Once the samples have been encoded, the next step is to apply a data analytics scheme in order to predict the next paired sample and, then, unpair it to obtain the future coordinates x and y (or also z). In this section we evaluate the feasibility of the proposed scheme in terms of pairing, predicting and unpairing operations. Of course, we are not focusing on a particular prediction model/algorithm; we want to show the performance of the pairing functions as the final result. The samples have therefore been: a) parsed (from the log-file structure to MATLAB), b) paired (by applying PFs), c) predicted (by a particular predictor, as shown later), d) unpaired (by applying $pf^{-1}$ to the predicted sample), e) compared (in order to evaluate the prediction error). We based our approach on the AR(p) models [23, 24], assuming that the sequence of mobility samples can be considered as an auto-regressive process with p = 1. For more details on mobility prediction approaches, please refer to [25]. After training an AR(1) predictor on the traces of figures 1, 2, 3, the following


predicted trends have been obtained. In particular, the first 30 samples have been used for training the predictor, while the last 70 samples have been predicted: figures 7, 8 represent the accuracy of the predicted values (indicated by the markers) with respect to the original samples (continuous lines). KAIST samples are not shown because their magnitude is very different from that of the other two trends. It can be seen how the predicted samples are very close to the original paired curve. At this point, the predicted values have been unpaired and the obtained results have been compared with the original values. Figures 9, 10, 11, 12 show the goodness of the prediction and unpairing operations for the last 70 samples of each trace. The result mostly depends on the predictor accuracy, but it can be seen how the obtained values are very close to the original trend (solid curve).
Figure 7. The trend of the NCSU and Disney samples, paired by Cantor's PF and predicted by an AR(1).
Figure 8. The trend of the NCSU and Disney samples, paired by Szudzik's PF and predicted by an AR(1).
Figure 9. Unpaired predicted x samples for the NCSU trace (last 70 samples).
Figure 10. Unpaired predicted y samples for the NCSU trace (last 70 samples).
Figure 11. Unpaired predicted x samples for the Disney trace (last 70 samples).

V. CONCLUSION AND FUTURE WORKS
This paper argues for the possibility of analyzing node mobility from a different point of view: in particular, when prediction operations need to be made, mobility can be represented in a different way, by the integration of pairing functions, which are able to represent the different coordinates (x, y and z) with only one value. Two pairing functions have been described and exploited to see how mobility can be represented. Paired values have been obtained and predicted, while the approximation in reconstructing the original traces has also been illustrated, showing the feasibility of the proposed idea.

ACKNOWLEDGMENT
This work was supported by the Czech Ministry of Education, Youth and Sports from the Large Infrastructures for Research, Experimental Development and Innovations project, IT4Innovations National Supercomputing Center LM2015070 and partly by the institutional grant SGS reg. no. SP2020/65 conducted at VSB - Technical University of Ostrava.
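As an illustration of the pair, predict, unpair and compare steps evaluated in Section IV, the following Python sketch fits an AR(1) model to the first 30 paired samples and makes one-step-ahead predictions for the remaining ones. It is a sketch under stated assumptions, not the authors' MATLAB testbed: the ordinary least-squares fit, the rounding of the predicted value to the nearest non-negative integer, the function names and the synthetic trace are all hypothetical choices, and the parsing step is assumed to have already produced non-negative integer coordinates.

```python
from math import isqrt


def szudzik_pair(x: int, y: int) -> int:
    """Eq. (2): Szudzik's pairing function."""
    return x + y * y if x < y else x * x + x + y


def szudzik_unpair(a: int) -> tuple:
    """Eqs. (5)-(6): inverse of szudzik_pair."""
    root = isqrt(a)
    rest = a - root * root
    return (rest, root) if rest < root else (root, rest - root)


def fit_ar1(series):
    """Least-squares fit of z[k+1] = c + phi * z[k]; returns (c, phi)."""
    prev, nxt = series[:-1], series[1:]
    n = len(prev)
    mean_prev = sum(prev) / n
    mean_next = sum(nxt) / n
    var = sum((p - mean_prev) ** 2 for p in prev)
    if var == 0:  # constant series: fall back to predicting the mean
        return mean_next, 0.0
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    phi = cov / var
    return mean_next - phi * mean_prev, phi


def run_pipeline(samples, n_train=30):
    """Return per-sample absolute errors (|dx|, |dy|) after unpairing."""
    paired = [szudzik_pair(x, y) for x, y in samples]        # b) pair
    c, phi = fit_ar1(paired[:n_train])                        # train on first samples
    errors = []
    for k in range(n_train, len(paired)):
        z_hat = max(0, round(c + phi * paired[k - 1]))        # c) predict next paired value
        x_hat, y_hat = szudzik_unpair(z_hat)                  # d) unpair the prediction
        x, y = samples[k]
        errors.append((abs(x_hat - x), abs(y_hat - y)))       # e) compare with the original
    return errors


if __name__ == "__main__":
    # Hypothetical 100-sample trace of a slowly moving node
    trace = [(1000 + k // 10, 500 + k // 20) for k in range(100)]
    errs = run_pipeline(trace)
    print(len(errs), max(e[0] for e in errs), max(e[1] for e in errs))
```

Because unpairing involves a floor of a square root, a small error on the predicted paired value does not necessarily translate into an equally small error on both coordinates, which is why the comparison is done after unpairing rather than on the paired series alone.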


Figure 12. Unpaired predicted y samples for the Disney trace (last 70 samples).

REFERENCES
[1] L. Ward, A. Agrawal, A. Choudhary, C. Wolverton, “A general-purpose machine learning framework for predicting properties of inorganic materials,” in Computational Materials, Vol. 2, 2016.
[2] M. M. Najafabadi, F. Villanustre, T. M. Khoshgoftaar, N. Seliya, R. Wald, E. Muharemagic, “Deep learning applications and challenges in big data analytics,” in Journal of Big Data, Vol. 2, 2015.
[3] G. Xu, S. Gao, M. Daneshmand, C. Wang, Y. Liu, “A Survey for Mobility Big Data Analytics for Geolocation Prediction,” in IEEE Wirel. Comm., Vol. 24, Issue 1, 2017.
[4] P. Fazio, F. De Rango, A. Lupia, “Vehicular networks and road safety: An application for emergency/danger situations management using the WAVE/802.11p standard,” in Journal on Advances in Electrical and Electronic Engineering, Vol. 11, Issue 5, pp. 357-364, 2013.
[5] A. F. Santamaria, et al., “Managing Emergency Situations in VANET Through Heterogeneous Technologies Cooperation,” in Sensors 2018, Vol. 18, Issue 5.
[6] B. H. Krishna, et al., “Multiple text encryption, key entrenched, distributed cipher using pairing functions and transposition ciphers,” in International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016.
[7] M. P. Szudzik, “The Rosenberg-Strong Pairing Function”, 2019 v5 version, https://arxiv.org/abs/1706.04129.
[8] S. Wolfram, “A New Kind of Science”, W. Media, 2002.
[9] G. Cantor, “Ein beitrag zur mannigfaltigkeitslehre,” in Journal für die reine und angewandte Mathematik, Vol. 84, pp. 242-258, 1878.
[10] L.E.J. Brouwer, “Beweis der Invarianz des n-dimensionalen Gebiets,” in Mathematische Annalen, Vol. 71, pp. 305-315, 1912.
[11] Y. Kanbara, T. Teruya, N. Kanayama, T. Nishide, E. Okamoto, “Software Implementation of a Pairing Function for Public Key Cryptosystems,” in 5th International Conference on IT Convergence and Security (ICITCS), 2015.
[12] A.F. Santamaria, P. Fazio, P. Raimondo, M. Tropea, F. De Rango, “A New Distributed Predictive Congestion Aware Re-Routing Algorithm for CO2 Emissions Reduction,” in IEEE Transactions on Vehicular Technology, Vol. 68 (5), 2019, pp. 4419-4433.
[13] F. De Rango, P. Fazio, S. Marano, “Utility-based predictive services for adaptive wireless networks with mobile hosts,” in IEEE Transactions on Vehicular Technology, Vol. 58 (3), 2008, pp. 1415-1428.
[14] B. H. Krishna, et al., “Multiple text encryption, key entrenched, distributed cipher using pairing functions and transposition ciphers,” in International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), 2016.
[15] P. Yang, X. L. Hong Ji, H. Zhang, “A novel mobility prediction scheme for outdoor crowded scenario using Fuzzy C-means,” in IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), 2017.
[16] P. Rathore, D. Kumar, S. Rajasegarar, M. Palaniswami, J. C. Bezdek, “A Scalable Framework for Trajectory Prediction,” in IEEE Transactions on Intelligent Transportation Systems, Vol. 20, Issue 10, pp. 3860-3874, 2019.
[17] S. U. Rahman, et al., “Positioning of UAVs for throughput maximization in software-defined disaster area UAV communication networks,” in Journal of Communications and Networks, Vol. 20, Issue 5, pp. 452-463, 2018.
[18] N. Yamada, et al., “Location prediction based on Smartphone Multimodal Personal Data for Proactive Support Services,” in Eleventh International Conference on Mobile Computing and Ubiquitous Network (ICMU), 2018.
[19] R. Wu, et al., “Learning Individual Moving Preference and Social Interaction for Location Prediction,” in IEEE Access, Vol. 6, pp. 10675-10687, 2018.
[20] R. Wu, et al., “Location Prediction on Trajectory Data: A Review,” in Big Data Mining and Analytics, Vol. 1, Issue 2, pp. 108-127, 2018.
[21] F. Li, et al., “A Personal Location Prediction Method Based on Individual Trajectory and Group Trajectory,” in IEEE Access, Vol. 7, pp. 92850-92860, 2019.
[22] J. Wang, et al., “Urban Human Mobility: Data-Driven Modeling and Prediction,” in ACM SIGKDD Explorations Newsletter, Vol. 21, Issue 1, pp. 1-19, 2019.
[23] J. Lee, “Univariate time series modeling and forecasting (Box-Jenkins Method),” in Econ 413, lecture 4.
[24] G.E.P. Box, et al., “Time Series Analysis, Forecasting and Control,” in Holden-Day, San Francisco, 1970.
[25] P. Fazio, et al., “Prediction and QoS Enhancement in New Generation Cellular Networks with Mobile Hosts: A Survey on Different Protocols and Conventional/Unconventional Approaches,” in IEEE Communications Surveys and Tutorials, Vol. 19, Issue 3, pp. 1822-1841, 2017.
[26] CRAWDAD, “A Community Resource for Archiving Wireless Data At Dartmouth,” http://crawdad.org.
[27] I. Rhee, et al., “CRAWDAD dataset ncsu/mobilitymodels (v. 2009-07-23),” downloaded from https://crawdad.org/ncsu/mobilitymodels/20090723, https://doi.org/10.15783/C7X302, Jul 2009.

List of Authors
Author Page(s) Author Page(s)
Aznar, Pablo 175 Horiguchi, Tatsuya 49
Azumi, Takuya 49 Iacovelli, Giovanni 190
Baron, Wojciech 115 Iaffaldano, Giuseppe 151
Boccadoro, Pietro 190 Ianni, Mauro 59
Boudjadar, Jalil 163 Igarashi, Shingo 49
Boukerche, Azzedine 84 Ishigooka, Tasuku 49
Bousbaa, Fatima 182 Jiménez-Bravo, Diego 37
Bout, Emilie 146 Kerrache, Chaker Abdelaziz 182
Braem, Bart 37 Khooban, Mohammad Hassan 163
Buchholz, Peter 41 Koike, Ryotaro 49
Cakmak, Hueseyin 16 Kuehnapfel, Uwe 16
Calafate, Carlos 175 Kyesswa, Michael 16
Campolo, Claudia 1 Lagraa, Nasreddine 182
Cano, Juan-Carlos 175 Lakas, Abderrahmane 182
Cheriguene, Youssra 182 Lalis, Spyros 198
Cicirelli, Franco 107 155 Loscrí, Valeria 146
Ciociola, Alessandro 206 Manzoni, Pietro 175
Cocca, Michele 206 Marcellan, Anna 25
De Grande, Robson 214 Marfia, Gustavo 142
De Rango, Floriano 77 Marilleau, Nicolas 100
Diallo, Moussa 100 Marotta, Romolo 59
Dias, João Pedro 92 Marquez-Barja, Johann 37
Djanatliev, Anatoli 115 Masala Mutombo, Pierre 37
Djellikh, Soumia 182 Maurya, Avinash 167
Donatiello, Lorenzo 142 Mehic, Miralem 222
Drasar, Martin 7 Mellia, Marco 206
Erdmann, Anselm 25 Molinaro, Antonella 1
Fabra, Francisco 175 Moskal, Stephen 7
Falcone, Alberto 137 Müller, Dirk 25
Fazio, Peppino 222 Mussini, Marco 151
Ferreira, Hugo 92 Nevigato, Nicolas 77
Gallais, Antoine 146 Ngom, Bassirou 100
Garro, Alfredo 137 Nicassio, Francesco 151
Gasparini, Lorenzo 142 Nicolae, Bogdan 167
Genovese, Giacomo 1 Nigro, Libero 107
Gentile, Antonio 155 Park, Sung woon 84
Giordano, Danilo 206 Parladori, Giorgio 151
Greco, Emilio 155 Partila, Pavol 222
Grewing, Christian 133 Pellegrini, Alessandro 59 68
Grieco, Giovanni 151 Piccione, Andrea 68
Grieco, Luigi Alfredo 190 Piro, Giuseppe 151
Grigoropoulos, Nasos 198 Pizzimenti, Bruno 1
Guan, Shichao 84 Potuzak, Tomas 123
Guerrieri, Antonio 155 Puzicha, Alexander 41
Guliani, Ishan 167 Quaglia, Francesco 59
Gütlein, Moritz 115 Rab, Maryan 59
Hagenmeyer, Veit 16 25 Rafique, M. Mustafa 167
Henke, Martin 25 Renner, Christopher 115
Hering, Dominik 25 Restivo, André 92
Robens, Markus 133 Tovarek, Jaromir 222
Saad, Abubakar 214 Triggiani, Francesco 151
Serianni, Abdon 33 Tropea, Mauro 33 77
Shah, Awais 151 Ulbrich, Carolin 25
Schiek, Michael 133 van Waasen, Stefan 133
Schlatmann, Rutger 25 Vassio, Luca 206
Schmurr, Philipp 16 Vinci, Andrea 155
Spezzano, Giandomenico 155 Voznak, Miroslav 222
Suriyah, Michael 25 Wubben, Jamie 175
Suslov, Sergey 133 Xhonneux, André 25
Tahari, Abdou El Karim 182 Yang, Shanchieh 7
Torres, Diogo 92 Zaťko, Pavol 7
