Article

From Data to Actions in Intelligent Transportation Systems: A Prescription of Functional
Requirements for Model Actionability

Ibai Laña, Javier J. Sanchez-Medina, Eleni I. Vlahogianni and Javier Del Ser

Special Issue
Developing “Smartness” in Emerging Environments and Applications with Focus on the Internet of
Things (IoT)
Edited by
Prof. Dr. Rashid Mehmood, Prof. Dr. Juan M. Corchado and Prof. Dr. Tan Yigitcanlar

https://doi.org/10.3390/s21041121
Ibai Laña 1,*, Javier J. Sanchez-Medina 2, Eleni I. Vlahogianni 3 and Javier Del Ser 1,4

1 TECNALIA, Basque Research & Technology Alliance (BRTA), P. Tecnologico Bizkaia, Ed. 700,
48160 Derio, Spain; javier.delser@tecnalia.com or javier.delser@ehu.eus
2 CICEI, Department of Computer Science, University of Las Palmas de Gran Canaria, 35001 Las Palmas, Spain;
javier.sanchez@uplgc.es
3 Department of Transportation Planning and Engineering, National Technical University of Athens,
15780 Zografou, Greece; elenivl@mail.ntua.gr
4 Department of Communications Engineering, University of the Basque Country UPV/EHU,
Alameda Urquijo S/N, 48013 Bilbao, Spain
* Correspondence: ibai.lana@tecnalia.com

Abstract: Advances in Data Science permeate every field of Transportation Science and Engineering,
resulting in developments in the transportation sector that are data-driven. Nowadays, Intelligent
Transportation Systems (ITS) could be arguably approached as a “story” intensively producing and
consuming large amounts of data. A diversity of sensing devices densely spread over the
infrastructure, vehicles or the travelers’ personal devices act as sources of data flows that are
eventually fed into software running on automatic devices, actuators or control systems producing,
in turn, complex information flows among users, traffic managers, data analysts, traffic modeling
scientists, etc. These information flows provide enormous opportunities to improve model
development and decision-making. This work aims to describe how data, coming from diverse ITS
sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets,
systems and processes; in other words, for data-based models to fully become actionable. Grounded
in this described data modeling pipeline for ITS, we define the characteristics, engineering
requisites and challenges intrinsic to its three compounding stages, namely, data fusion, adaptive
learning and model evaluation. We deliberately generalize model learning to be adaptive, since at
the core of our paper is the firm conviction that most learners will have to adapt to the
ever-changing phenomenon scenario underlying the majority of ITS applications. Finally, we provide
a prospect of current research lines within Data Science that can bring notable advances to
data-based ITS modeling, which will eventually bridge the gap towards the practicality and
actionability of such models.

Keywords: Intelligent Transportation Systems; functional requirements; machine learning; model
actionability; model evaluation

Citation: Laña, I.; Sanchez-Medina, J.J.; Vlahogianni, E.I.; Del Ser, J. From Data to Actions in
Intelligent Transportation Systems: A Prescription of Functional Requirements for Model
Actionability. Sensors 2021, 21, 1121. https://doi.org/10.3390/s21041121

Academic Editor: Rashid Mehmood
Received: 7 January 2021
Accepted: 2 February 2021
Published: 5 February 2021

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and
institutional affiliations.

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).

1. Introduction
In recent years, Intelligent Transportation Systems (ITS) have experienced an unparalleled
expansion for many reasons. The availability of cost-effective sensor networks, pervasive
computation in assorted flavors (distributed/edge/fog computing) and the so-called Internet of
Things are all accelerating the evolution of ITS [1]. On top of that, Smart Cities cannot be
understood without Smart Mobility and ITS as technological pillars sustaining their operation [2].
Smartness springs from connectivity and intelligence, which implies that massive flows of
information are acquired, processed, modeled and used to enable faster and informed decisions.
For the last couple of decades, ITS have grown enough to cross-pollinate with previously
distant areas such as Machine Learning and its superset in the Artificial Intelligence
taxonomy: Data Science. These days, Data Science is placed at the methodological core of
works ranging from traffic and safety analysis, modeling and simulation, to transit network
optimization, autonomous and connected driving and shared mobility. Since the early 1990s,
most ITS have relied exclusively on methods such as traditional statistics, econometric
methods, Kalman filters, Bayesian regression, auto-regressive models for time series and
Neural Networks [3,4]. What has changed dramatically over the years is the abundance of
available data in ITS application scenarios as a result of new forms of sensing (e.g., crowd
sensing) with unprecedented levels of heterogeneity and velocity. Zhang et al. [3] define
this new form of data-driven ITS as systems that are vision-, multisource- and
learning-algorithm-driven, so as to optimize their performance and augment their privacy-aware,
people-centric character.
The exploitation of this upsurge of data has been enabled by advances in computa-
tional structures for data storage, retrieval and analysis, which have rendered it feasible to
train and maintain extremely complex data-based models. These baseline technologies have
laid a solid substrate for the proliferation of studies dealing with powerful modeling ap-
proaches such as Deep Learning or bio-inspired computation [5], which currently protrude
in the literature as the de facto modeling choice for a myriad of data-intensive applications.
However, careful consideration must be given to the systematic and myopic selection of
complex data-based solutions over well-established modeling choices. The current
research mainstream seems to be misleadingly focused on performance-biased studies,
in a fast-paced race towards incorporating sophisticated data-based models into manifold
research areas, leaving aside or completely disregarding the operational aspects that condition
the applicability of such models in ITS environments. The scope of this work is to review
existing literature on data-driven modeling and ITS, and to identify the functional elements
and specific requirements of engineering solutions, which are the ultimate enablers for
data-based models to lead towards efficient means to operate ITS assets, systems and
processes; in other words, for data-based models to fully become actionable. Bearing the above
rationale in mind, this work underscores the need for formulating the requirements to be
met by forthcoming research contributions around data-based modeling in ITS.
To this end, we focus mainly on system-level on-line operations that hinge on data-based
pipelines. However, ITS is a wide research field, encompassing operations held at longer
time scales (e.g., long-term and mid-term planning) that may not demand some of the
functional requirements discussed throughout our work. Furthermore, our discussions
target system-level operations rather than user-level or vehicle-level applications, since in
the latter the information flow from and to the system is scarce. Nevertheless, some of the
described functional requirements for system-level real-time decisions can be extrapolated
seamlessly to other levels and time scales. From this perspective, our ultimate goal is to
prescribe, or at least set forth, the main guidelines for the design of models that rely
heavily on the collection, analysis and exploitation of data. Accordingly, we delve into a
series of contributions that are summarized below:
• In the first place, we identify the gap between the data-driven research reported so far,
and the practical requirements that ITS experts demand in operation. We capitalize
on this gap to define what we herein refer to as actionable data-based modeling workflow,
which comprises all data processing stages that should be considered by any actionable
data-based ITS model. Although diverse data-based modeling workflows can be
found in the literature with different purposes, most of them share recognized stages,
which are presented in this work from an actionability perspective, i.e., what to take
into account from the operational point of view when designing the workflow, how to
capture and preprocess data, how to develop a model and how to prescribe its output.
These guidelines are proposed and argued within an ITS application context. However,
they can be useful for any other discipline in which data-based modeling is performed.
• Next, functional requirements to be satisfied by the aforementioned workflow are
described and framed in the context of ITS systems and processes, with examples
exposing their relevance and the consequences of not fulfilling them. The contributions
of this section are twofold: on the one hand, we identify and define the holistically
actionable ITS model along with its main features; on the other hand, we enumerate
the requirements for each feature to be considered actionable, and review the
latest literature dealing with these features and requisites.
• Finally, on a prospective note we elaborate on current research areas of Data Science
that should progressively enter the ITS arena to bridge the identified gap to action-
ability. Once the challenges of modeling and ITS requirements have been stated,
we review emerging research areas in Artificial Intelligence and Data Science that can
contribute to the fulfilment of such requirements. We expect our reflective analysis
to serve as guiding material for the community to steer efforts towards modeling
aspects of more impact for the field than the performance of the model itself.
In summary, the contributions of this work consist of identifying the main actionability
gaps in the data-based modeling workflow, gathering and describing the fundamental
requirements for a system to be actionable and, considering both these requirements and
the usual data-based processing workflow, proposing solutions based on the most recent
technologies. These contributions are organized throughout the rest of the paper as follows:
Section 2 delves into the actionable data-based modeling workflow, i.e., the canonical data
processing pipeline that should be considered by a fully actionable ITS system with
data-based models in use. Section 3 follows by elaborating on the functional features that an ITS
system should comply with so as to be regarded as actionable. Once these requirements are
listed and argued in detail, Section 4 analyzes research paths under vibrant activity in areas
related to Data Science that could bring profitable insights with regard to the actionability
of data-based models for the ITS field, such as explainable AI, the inference of causality
from data, online learning and adaptation to non-stationary data flows. Finally, Section 5
concludes the paper with summarizing remarks drawn from our prospect.

2. From Data to Actions: An Actionable Data-Based Modeling Workflow


ITS applications with data-driven modeling problems underneath range from the
characterization of driving behavioral patterns and the inference of typical routes to traffic
flow forecasting, among others. Data-driven modeling can be considered to include the
family of problems where a computational model or system must be characterized or
learned from a set of inputs and their expected outputs [6]. In the context of this definition,
actionability complements the data-driven model by prescribing the actions (in the form
of rules, optimized variable values or any other similar representation) that build upon the
output knowledge enabled by the model.
In general, a design workflow for data-based modeling consists of four sequential stages:
(1) data acquisition (sensing), which usually considers different sources; (2) data
preprocessing, which aims at building consistent, complete, statistically robust datasets; (3) data
modeling, where a model is learned for different purposes; and (4) model exploitation,
which includes the definition of actions to be taken with respect to the insights provided
by models in real-life application scenarios. These four stages can be regarded as the core
of off-line data-driven modeling; however, when the time dimension comes into play, a fifth
stage, adaptation, must be considered as an iterative stage of this data pipeline, aimed
at keeping learned models updated and adapted to eventual changes in the data
distribution. This adaptation is crucial for real-life scenarios, where changes can happen
in all stages, from variations of the input data sources to interpretation adjustments and
other sources of non-stationarity that imprint the so-called concept drift on the underlying
phenomenon to be modelled [7]. We now delve into these five data processing stages in
the context of their implementation in ITS applications, following the diagram in Figure 1.
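As an illustration of how the five stages can interlock, the following minimal sketch runs a toy speed stream through sensing, preprocessing, modeling, prescription and adaptation. It is not a method from the literature reviewed here: the class `AdaptiveMeanForecaster`, the `drift_factor` knob, the 60 km/h threshold and the action labels are all hypothetical choices made for the example, and the drift test (an error much larger than the recent average error) is a deliberately crude stand-in for a proper concept drift detector.

```python
from collections import deque
from statistics import mean


class AdaptiveMeanForecaster:
    """Toy model: forecasts the mean of the most recent observations."""

    def __init__(self, window=5, drift_factor=3.0):
        self.window = window
        self.drift_factor = drift_factor  # hypothetical drift sensitivity
        self.history = deque(maxlen=window)
        self.errors = deque(maxlen=window)
        self.resets = 0

    def predict(self):
        return mean(self.history) if self.history else None

    def adapt(self, value):
        """Adaptation: learn from the new value; forget the past on drift."""
        forecast = self.predict()
        if forecast is not None:
            error = abs(value - forecast)
            baseline_ready = len(self.errors) == self.window
            limit = self.drift_factor * max(mean(self.errors), 1e-9) if baseline_ready else None
            if baseline_ready and error > limit:
                # Error far above its recent level: assume the concept changed.
                self.history.clear()
                self.errors.clear()
                self.resets += 1
            else:
                self.errors.append(error)
        self.history.append(value)


def prescribe(forecast_speed, threshold=60.0):
    """Prescription: map the model output to an action (hypothetical rule)."""
    if forecast_speed is None:
        return "wait"
    return "reroute" if forecast_speed < threshold else "no action"


def run_workflow(stream):
    """Sensing -> preprocessing -> modeling -> prescription -> adaptation."""
    model, actions = AdaptiveMeanForecaster(), []
    for reading in stream:                          # sensing
        if reading is None:                         # preprocessing: drop gaps
            continue
        actions.append(prescribe(model.predict()))  # modeling + prescription
        model.adapt(reading)                        # adaptation
    return model, actions
```

Calling `run_workflow` on a stream whose level drops abruptly shows the point of the fifth stage: without the reset in `adapt`, the forecast would keep averaging pre-change and post-change values for several steps, and the prescribed action would lag behind the new traffic regime.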
The stages provided in Figure 1 can be considered as a standard workflow in any
data-based work. However, although these steps are easily recognisable, they are not
always all taken into account: practitioners commonly focus on only a
subset of them, disregarding their interactions or omitting some of them. For instance,
the prescription stage is not frequently considered, even though it is an essential link between
the modeling outcome and the final decision/action derived from the modeling result. Besides,
each step can have implications for the final actionability of the model, which is why all
of them are analyzed below.

[Figure 1: workflow diagram. Stages shown: SENSING; PREPROCESSING (data fusion, missing data
imputation, feature engineering, data transformation & augmentation); MODELING (forecasting,
classification, pattern discovery, function approximation); PRESCRIPTION (management decision
making, adaptive automation, planning decision making, optimization); ACTION; and ADAPTATION
(changing context, upcoming data).]
Figure 1. Data-based modeling workflow showing its main processing stages and their principal technology areas.

2.1. Data Acquisition (Sensing)


The path towards concrete data-based actions departs from the capture of available
ITS information, which in this specific sector is plentiful and highly diverse. The advent
of data science for ITS has come along with the unfolding of copious data sources
related to transportation. Indeed, ITS produce large volumes of sensed data, from the
environment perception layer of intelligent and connected vehicles, to human behaviour
detection/estimation (drivers, passengers, pedestrians) and the multiple technologies
deployed to sense traffic flow and behaviour. Concurrently, many other non-traditional
sources that are useful to infer the behavioral needs and expectations of people who use
transportation, such as social media, have become increasingly available and
exploited, augmenting the more conventional sensing sources towards more efficient mobility
solutions. Some of these data sources are currently used in almost any domain of ITS,
from operational perspectives such as the estimation of future transportation demands,
adaptive signaling or the discovery of mobility patterns, to the provision of practical
solutions, such as the development of autonomous vehicles, although not all sources are
suitable for all applications. Model actionability also depends on this early stage,
which is why data selection (when possible) should not be neglected. For instance,
a model that consumes speed data will probably require other measurements (e.g., the maximum
speed of a road segment) to ultimately provide something meaningful, while a model
that consumes travel-time data will be more straightforward.
Five main categories can be established to describe the spectrum of ITS data sources:
1. Roadside sensing, which brings together tools and mechanisms that directly capture
and convey data measurements from the road, obtaining valuable metrics such as
speed, occupancy, flow or even which vehicles are traversing a given road segment.
These are the most commonly used sensors in ITS, most frequently based on computer
vision and radar, as they directly provide traffic information close to the point where
it originates. These sensed metrics are useful for traffic flow or speed modeling,
allowing practitioners to identify mobility patterns and to model them, so that future
behavior in sensorized locations can be estimated. Counting vehicles or detecting
their speed at a certain point of the road also makes it possible to obtain network-wide
mobility patterns that can be compared to those provided by a simulation engine. This can
help traffic managers and city planners make long-term decisions, such as which
road should be extended or how a road cut could affect other segments. However,
this information is tethered to the exact points where the sensors are placed, thus the
actionability of a system built upon these data is subject to the geographical area
where such sensing devices are deployed and their range.
2. In-vehicle sensing, which includes a broad range of transponder devices that are
part of the on-board equipment of certain fleets. Commercial vehicles on land, air
and water usually have location devices that record and emit the position and other
metrics of the vehicles at all times. This opens up a wide range of ITS applications,
such as fleet management [8], route optimization [9], delay analysis and detection [10],
airport/port management [11] or, when the vehicles are a part of traffic, a detailed
analysis of their behavior along a complete route (not only in certain sensorized
locations) [12]. This technology is widespread in commercial fleets and its
multiple analytic applications are nowadays remarkably actionable, due to the requirements
of industry standards. However, approaches based on machine learning modeling are
starting to emerge, and should consider actionability as a core concern.
3. Cooperative sensing, which denotes the general family of data collection strategies
that regards the information provided by different users of the ITS ecosystem as a
whole, thus being grouped and jointly processed forward. This inner perspective of
traffic and transportation can be obtained through many mechanisms, and, although
it is more specific and scarce, it is also more complete than the one obtained from road-
side sensing. These data open the door to mobility profiling and anomaly detection,
enriching the outlook of a transportation model by means of the fusion of different
data-based views of an ITS scenario. This includes all forms of mobile sensing data,
from call detail record data that can be used to obtain users' trajectories [13], to GPS
data [14]. These sources are the foundation of abundant research [15,16], but in most
cases the data fusion part is omitted. Crowdsourced and Social Media sensing can
be analogously considered in this category. These data sources can also contribute to
data-based ITS models by means of sentiment analysis and geolocation. The use of
crowd-sourced data is well established among technology-based companies (Google,
Uber, etc.), yet such data are not often available to the research community or to the
private and public authorities in charge of transport operations management. The limited
information that becomes available lacks the statistical representativeness and
truthfulness needed to be easily integrated into legacy management systems.
4. External data sources, which include all data that are not directly related to traffic or
demand, but have an impact on it, such as weather, calendar, or planned events, social
and economic indicators, demographic characteristics, etc. These data are usually
easy to obtain, and their incorporation into ITS models generally improves the quality
of the insights they produce and, ultimately, the actionability of the decisions derived
from them. It is also true that these data sources are typically unstructured, which can
pose a challenge regarding their automatic integration.
5. Structured/static data, which refers to data sources that provide information of ele-
ments that have a direct impact on transportation, such as public transportation lines
and timetables, or municipal bike rental services. Due to their inherently structured
nature, data provided by these sources are often arranged in a fixed format, making them
easier to incorporate into subsequent data-based modeling stages. Any of the previous
data and applications can be enriched with this kind of data; a model that is able to
represent the mobility of a city would probably enhance its capabilities if it considered
these data. For instance, a bus timetable can help understand traffic in the street
segments that are traversed by the bus service or where its stops are located. These
information sources must be considered for an intelligent transportation system to be
actionable, being a particularly essential piece of urban and interurban mobility.

2.2. Data Preprocessing


The variety of the above-mentioned sensing sources comes with promises and perils.
These data are produced in various forms and formats, at various time resolutions,
synchronously or asynchronously, and at different rates of accumulation. To leverage the full
spectrum of knowledge these data can bring for the sake of informed decision making,
the more sensing opportunities there are, the greater the need for powerful preprocessing
and the associated skills before reaching the modeling stage.
A principled data-driven modeling workflow requires more than just applying off-
the-shelf tools. In this regard, preprocessing raw data is undoubtedly an elementary step
of the modeling process [17], but still persists nowadays as a step frequently overlooked by
researchers in the ITS field [18].
To begin with, when a model is to be built on real ITS data, an important fact to be taken
into account is the proneness of real environments to present missing or corrupted data due
to many uncertain events that can affect the whole collection, transformation, transmission
and storage process [19]. This issue needs to be assessed, controlled and suitably tackled
before proceeding further with next stages of the processing pipeline. Otherwise, missing
and/or corrupted instances within the captured data may distort severely the outcome of
data-based models, hindering their practical utility [20]. A wide range of missing data
imputation strategies can be found in literature [21,22], as well as methods to identify,
correct or discriminate abnormal data inputs [23]. However, they are often loosely coupled
to the rest of the modeling pipeline [24]. An actionable data preprocessing should focus not
only on improving the quality of the captured data in terms of completeness and regularity,
but also on providing valuable insights about the underlying phenomena yielding missing,
corrupted and/or outlying data, along with their implications on modeling [25].
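A minimal sketch of this idea, under our own assumptions, is given next: gaps in a series of readings are filled by linear interpolation, but the function also returns a missingness mask and the location and length of every gap, so that downstream stages can reason about why and where data were lost instead of treating imputed values as measurements. The function name `impute_linear` and the encoding of missing readings as `None` are choices made for the example, and linear interpolation is only one of the many imputation strategies available.

```python
def impute_linear(series):
    """Fill None gaps by linear interpolation.

    Returns (filled, mask, gaps): the imputed series, a per-sample flag that
    is True where the value was imputed, and a list of (start, length) runs.
    Gaps at the edges are filled with the nearest observed value.
    """
    filled = list(series)
    mask = [value is None for value in series]
    gaps = []
    n = len(series)
    i = 0
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        start = i
        while i < n and series[i] is None:
            i += 1
        gaps.append((start, i - start))
        left = series[start - 1] if start > 0 else None
        right = series[i] if i < n else None
        for k in range(start, i):
            if left is not None and right is not None:
                frac = (k - start + 1) / (i - start + 1)
                filled[k] = left + frac * (right - left)
            else:
                # Edge gap: carry the only available neighbour (may be None
                # if the whole series is missing).
                filled[k] = left if left is not None else right
    return filled, mask, gaps
```

Keeping `mask` alongside the data allows a model to be evaluated separately on observed and imputed samples, while `gaps` can be inspected for patterns (e.g., long outages concentrated at one sensor) that point to the phenomenon causing the losses.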
Next, the cleansed dataset can be engineered further to lay an enriched data substrate
for the subsequent modeling [26,27]. A number of operations can be applied to improve the
way in which data are further processed along the chain. For instance, data transformation
methods can be applied for different purposes related to the representation and distribution
of data (e.g., dimensionality reduction, standardization, normalization, discretization or
binarization). Although these transformations are not mandatory in all cases, a deep
knowledge of what input data represent and how they contribute to modeling is a key
aspect to be considered in this preprocessing stage.
Furthermore, data enrichment can be approached from two different perspectives,
depending on the characteristics of the dataset at this point. On the one hand, feature
selection/engineering refers to the implementation of methods to either discard irrelevant
features for the modeling problem at hand, or to produce more valuable data descriptors
by combining the original ones through different operations. On the other hand, instance
selection/generation implies a transformation of the original data in terms of the examples.
Removing instances can be a straightforward solution for corrupted data and/or outliers, whereas
the addition of synthetic instances can help train and validate models for which scarce real
data instances are available. Besides, these approaches are among the most predominant
techniques to cope with class imbalance [28], a very frequent problem in predictive model-
ing with real data. Whether each of these operations is required or not depends entirely on
the input data, their quality, abundance and the relations among them. This entails a deep
understanding of both data and domain, which is not always a common ground among
the ITS field practitioners [29].
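As a simple instance of instance generation for class imbalance, the sketch below applies random oversampling: minority-class examples are replicated at random until every class has as many examples as the largest one. The function name, the `seed` parameter and the class labels used in the usage note are assumptions of the example; more elaborate schemes that synthesize new instances rather than duplicating existing ones follow the same interface.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Replicate minority-class samples until all classes are the same size."""
    rng = random.Random(seed)  # seeded so that the resampling is repeatable
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw with replacement the number of extra samples this class needs.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels
```

For a dataset with eight "normal" and two "incident" examples, the output holds eight of each. Resampling of this kind should be applied to the training partition only, since duplicating instances before splitting the data would leak copies of training examples into the test set.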
Finally, data fusion embodies one of the most promising research fields for data-driven
ITS [3,30], yet remains marginally studied with respect to other modeling stages despite its
potential to boost the actionability of the overall data-based model. Indeed, an ITS model
can hardly be actionable if it does not exploit interactions among different data sources.
Upon their availability, ITS models can be enriched by fusing diverse data sources. A recent
review on different operational aspects of data-driven ITS developments states that these
models rarely count on more than one source of data [16]. This fact clearly unveils a
research niche when taking into account the increasing availability of data provided
by the growing number of sensors, devices and other data capturing mechanisms that are
deployed in transportation networks, in all sorts of vehicles, or even in personal devices
held by the infrastructure users. Despite the relative scarcity of contributions dealing with
this part of the data-based modeling workflow, the combination of multiple sources of
information has been proven to enrich the model output along different axes, from accuracy
to interpretability [31–34].
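One basic building block of such a fusion is the temporal alignment of sources that report at different rates. The sketch below, written under our own assumptions, attaches to each traffic record the most recent weather observation that is not older than a maximum age (an "as-of" join). The record layout (timestamp in seconds, value), the function name `fuse_asof` and the `max_age` parameter are hypothetical; real deployments would also need to handle spatial matching and clock differences between sources.

```python
from bisect import bisect_right


def fuse_asof(traffic, weather, max_age):
    """Join each (time, flow) record with the latest weather value.

    `weather` is a list of (time, value) pairs. A weather value is attached
    only if it was observed at or before the traffic record and at most
    `max_age` seconds earlier; otherwise None is attached.
    """
    weather = sorted(weather)
    times = [t for t, _ in weather]
    fused = []
    for t, flow in traffic:
        idx = bisect_right(times, t) - 1  # last observation with time <= t
        value = None
        if idx >= 0 and t - times[idx] <= max_age:
            value = weather[idx][1]
        fused.append((t, flow, value))
    return fused
```

Returning `None` when no sufficiently recent observation exists, rather than silently reusing a stale one, keeps the fused dataset honest about its own coverage and leaves the decision on how to treat those records to the imputation step.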

2.3. Modeling
Once data are obtained, fused, preprocessed and curated, the modeling phase implies
the extraction of knowledge by constructing a model to characterize the distribution of
such data or their evolution in time. The distillation of such knowledge can be performed
for different purposes: to represent unsupervised data in a more valuable manner (as in,
e.g., clustering or manifold learning); to gain insight into patterns relating the input
data to a set of supervised outputs (classification/regression), aiming to
automatically label unseen data observations; to predict future values based on previous
values (time series forecasting); or to inspect the output produced by a model when
processing input data (simulation). To do so, data-based modeling often resorts to machine
learning algorithms, which allow automating the modeling process itself.
The above purposes can serve as a discrimination criterion for different algorithmic
approaches for data-based modeling. However, when the goal is to model data interac-
tions within complex systems such as transportation networks, it is often the case that
the modeling choice resorts to ensembles of different learner types. For instance, when
applying regression models for road traffic forecasting, a first clustering stage is often
advisable to unveil typicalities in the historical traffic profiles and to feed them as priors
for the subsequent predictive modeling [35–37]. However, when it comes to model
actionability, a key feature of this stage is the generalization of the developed model to unseen
data. This characteristic means making a model useful beyond the data on which it is
trained, so model design efforts should be put not only into making the
model achieve a marginally superior performance, but also into making it useful in other
spatial or temporal circumstances. Good generalization properties for the developed model
can be pursued by diverse means, which often depend on the modeling purpose at hand
(e.g., cross-validation, regularization, or the use of ensembles in predictive modeling).
Essentially, the design goal is to find the trade-off between performance (through
representing much of the intrinsic variance of data) and generalization (staying away from a
model overfitted to a particular training set). This aspect becomes utterly relevant when
data modeling is done on time-varying data produced by dynamic phenomena. ITS are,
in point of fact, complex scenarios subject to strong sources of non-stationarity, thereby
calling for an utmost focus on this aspect.
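For time-varying data, one common way to estimate generalization is walk-forward (expanding-window) validation, in which every test fold lies strictly after the data used for training. The sketch below illustrates the splitting scheme with a historical-mean forecaster as a stand-in model; the function names and default fold sizes are assumptions of the example, not a procedure prescribed by the works cited above.

```python
from statistics import mean


def walk_forward_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) with training always in the past."""
    first_test = n_samples - n_splits * test_size
    if first_test <= 0:
        raise ValueError("not enough samples for the requested splits")
    for k in range(n_splits):
        start = first_test + k * test_size
        yield list(range(start)), list(range(start, start + test_size))


def walk_forward_mae(series, n_splits=3, test_size=2):
    """Mean absolute error per fold for a historical-mean forecaster."""
    fold_errors = []
    for train_idx, test_idx in walk_forward_splits(len(series), n_splits, test_size):
        forecast = mean(series[i] for i in train_idx)  # stand-in model
        fold_errors.append(mean(abs(series[i] - forecast) for i in test_idx))
    return fold_errors
```

Reporting the error of each fold, instead of a single average, makes it visible when performance degrades over the later folds, which is a first symptom of the non-stationarity discussed above. Shuffled cross-validation would hide this effect, because future samples would leak into the training set.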
The complexity met in traffic and transportation operations is usually treated with
heterogeneous modeling approaches that aim to complement each other to improve ac-
curacy [38–40]. This can be done either by comparing different models and selecting the
most appropriate one every time, or by combining different models to produce the final
outcome. Additionally, in some fields of ITS, such as traffic modeling, physical (namely,
theory- or simulation-based) models have been available for decades. Their integration into
data-based modeling workflows, considering the knowledge they can provide, can become
crucial for a manifold of purposes, e.g., to enforce traffic theory awareness in models
learned from ITS data. Indeed, the hybridization of physical and data-based models
has a yet to be developed potential that has only been timidly explored in some recent
works [41–43].
Interestingly, complex data-driven modeling solutions to transportation phenomena
have been numerous and resourceful, ranging from modular structures to model combinations,
surrogate modeling [44] and so on. Regardless of the approach, the literature emphasizes
the critical issue of model hyperparameter optimization using, for example, nature-inspired
algorithms, namely Evolutionary Computation or Swarm Intelligence [39,45].
Assuming that there is a feasible and acceptable solution to the problem of selecting the
parameters of a data-driven model, when dealing with complex modeling structures
this task should be conducted automatically by optimizing over the hyperparameter space,
usually based on the models’ predictive error. It should be noted that the greater the number
of models involved, the more difficult the optimization task becomes. Moreover, when relying on
nature-inspired stochastic approaches, full determinism in the solution and convergence
stability cannot be formally guaranteed [5].
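The practical consequence of this lack of determinism can be illustrated with a much simpler stochastic optimizer than those cited above. The sketch below uses plain random search (swapped in here for Evolutionary Computation or Swarm Intelligence purely for brevity) to tune the smoothing factor of an exponential smoothing forecaster against its one-step-ahead error. All names, the search range and the number of trials are assumptions of the example.

```python
import random


def smoothing_mae(series, alpha):
    """One-step-ahead mean absolute error of simple exponential smoothing."""
    level, total = series[0], 0.0
    for value in series[1:]:
        total += abs(value - level)            # forecast is the current level
        level = alpha * value + (1 - alpha) * level
    return total / (len(series) - 1)


def random_search(series, n_trials=50, seed=0):
    """Sample alpha uniformly and keep the value with the lowest error."""
    rng = random.Random(seed)  # a fixed seed makes a single run repeatable
    best_alpha, best_error = None, float("inf")
    for _ in range(n_trials):
        alpha = rng.uniform(0.01, 0.99)
        error = smoothing_mae(series, alpha)
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


def search_spread(series, seeds=(0, 1, 2, 3, 4)):
    """Best error found under several seeds, as a (min, max) pair."""
    errors = [random_search(series, seed=s)[1] for s in seeds]
    return min(errors), max(errors)
```

Fixing the seed makes one run reproducible, but it does not remove the dependence of the result on that seed. Reporting the spread returned by `search_spread`, rather than the outcome of a single run, is a simple way to expose how stable the tuned configuration is.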

2.4. Prescription
Once the modeling phase itself has been completed, the resulting model faces its
application to a real ITS environment. It is at this stage when actions deriving from the
data insights are defined/learned/decided, and when the actionability of the model can be
best assessed. Yet, this stage is frequently overlooked in ITS research, where most
works conclude by presenting the good performance of a model; it is uncommon to find
evaluations of a given model in terms of its final application in a certain environment.
Are the actions that can be taken as a result of the outcome of a data-based model aimed at
strategic, tactical or operational decision making? Is the output of the data-based model
able to support decisions made by transportation network managers? Can the output
be consumed directly without any need for further modeling, or exploited by means of a
secondary modeling process aimed at optimizing the decision making process? This latter
case can be exemplified, for instance, by the formulation of the decision making process as
an optimization problem, in which actions are represented by the variables compounding
a solution to the problem, and the output of the previous data-based modeling phase can
be used to quantitatively estimate the quality or fitness of the solution. One of the most
prominent examples of this prescription mode deals with routing problems, since they often
use simulation tools or predictive models to assess the travel time, pollutant emissions or
any other optimization objective characterizing the fitness of the tested routes [46,47]. Other
examples of prescription based on data emerge in tactical and strategic planning, such as
the modification of public transportation lines [48], the establishment of special lanes
(e.g., taxi, bike) [49], the improvement of road features [50], the adaptive control of traffic
signaling [51], the identification of optimal delivery (or pickup) routes for different kinds
of transportation services [52], incident detection and management [53], learning for
automated driving [54], or the design of sustainable urban mobility plans based on the
current and future demand or the drivers’ behavior [55,56].
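The routing case described above, in which a predictive model scores candidate actions inside an optimization problem, can be sketched as follows. This is a toy illustration under stated assumptions: the network, the link times, the peak-hour rule in `predict_link_time` (a stand-in for a trained predictive model) and the exhaustive search over three candidate routes are all hypothetical.

```python
# Hypothetical free-flow travel times (minutes) for the links of a toy network.
FREE_FLOW_MIN = {
    ("A", "B"): 4.0, ("B", "D"): 6.0, ("A", "C"): 5.0,
    ("C", "D"): 4.0, ("B", "C"): 2.0,
}
CONGESTION_PRONE = {("C", "D")}  # assumed to slow down in peak hours

def predict_link_time(link, hour):
    """Stand-in for a data-based model: predicted link travel time in minutes."""
    peak = 7 <= hour <= 9 or 17 <= hour <= 19
    factor = 2.0 if (peak and link in CONGESTION_PRONE) else 1.0
    return FREE_FLOW_MIN[link] * factor

def route_fitness(route, hour):
    """Fitness of a candidate solution: total predicted travel time of the route."""
    return sum(predict_link_time(link, hour) for link in zip(route, route[1:]))

def prescribe_route(candidate_routes, hour):
    """Prescription step: choose the action (route) with the best predicted fitness."""
    return min(candidate_routes, key=lambda route: route_fitness(route, hour))

ROUTES = [("A", "B", "D"), ("A", "C", "D"), ("A", "B", "C", "D")]
print(prescribe_route(ROUTES, hour=12))  # off-peak prescription
print(prescribe_route(ROUTES, hour=8))   # peak-hour prescription
```

The prescribed route changes with the hour only because the predicted link times change, which is the dependence between modeling output and decision that the text describes. A realistic planner would replace the exhaustive `min` with a proper optimization method.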
In any of the above presented ITS cases, a data-based model should be equipped with
a certain set of features that guarantee its actionability. For instance, if a traffic manager
is not able to interpret a model or understand its outcome in terms of confidence, the model can
hardly be applied to practical decision making. When the model is used for adaptive
control purposes (as in automated traffic light scheduling), the adaptability of the model
to contextual changes is a key requirement for prescribed actions to be matched to the
current traffic status [57]. Interestingly, some control techniques with a long history in
the field (e.g., Stochastic Model Predictive Control, SMPC, [58]) serve as a good example
of the triple-play between application requirements, decision making and data-based
models. When dealing with the design of control methods in ITS, SMPC has been proven
to perform efficiently in highly-complex systems subject to the probabilistic occurrence of
uncertainties [59]. Specifically, SMPC combines at its core data-based predictive modeling
and low-complexity chance-constrained optimization to deal with control problems that
require the method to operate in real time. In this case, and in most
actionable data-based workflows where decision making is formulated as an optimization
problem, we note a clear entanglement between application requirements (e.g., real-time
processing), decision making (low-complexity, dynamic optimization techniques) and
data-based models (predictive modeling for system dynamics forecasting).
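To make the interplay between prediction, chance constraints and real-time operation more concrete, the sketch below shows a deliberately simplified, scenario-based chance-constrained decision that is re-solved at every signal cycle. It is not an SMPC implementation: the single-intersection setting, the Gaussian forecast noise, the parameter values and the names `chance_constrained_green` and `receding_horizon` are assumptions introduced for illustration.

```python
import math
import numpy as np

def chance_constrained_green(arrival_scenarios, saturation_flow,
                             epsilon=0.1, g_min=10.0, g_max=60.0):
    """Smallest green time g (seconds) whose discharge capacity saturation_flow * g
    covers the arrivals in at least a (1 - epsilon) share of the sampled scenarios,
    clipped to the admissible range [g_min, g_max]."""
    ordered = np.sort(np.asarray(arrival_scenarios, dtype=float))
    # Index of the empirical (1 - epsilon) quantile, rounded up to stay conservative.
    k = max(math.ceil((1.0 - epsilon) * ordered.size) - 1, 0)
    required = ordered[k] / saturation_flow
    return float(min(max(required, g_min), g_max))

def receding_horizon(forecasts, saturation_flow, noise_std,
                     n_scenarios=200, seed=0):
    """Re-solve the chance-constrained problem at every cycle with the latest forecast."""
    rng = np.random.default_rng(seed)
    plan = []
    for mean_arrivals in forecasts:
        # Scenarios around the forecast (vehicles per cycle); Gaussian noise is assumed.
        scenarios = np.maximum(rng.normal(mean_arrivals, noise_std, n_scenarios), 0.0)
        plan.append(chance_constrained_green(scenarios, saturation_flow))
    return plan
```

Because each decision reduces to sorting a few hundred samples, it comfortably fits a real-time budget, which illustrates why low-complexity formulations are favored in this setting.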

2.5. Adaptation
Finally, the proposed actionable data processing workflow considers model adaptation
as a processing layer that can be applied over different modeling stages along the pipeline.
When models are based on data, they are subject to many kinds of uncertainties and non-
stationarities that can affect all stages of the process. Streaming data initially used to build
the model can experience long-term drifts (for instance, an increase of the average number
of vehicles), sudden changes (a newly available road), or unexpected events (for example,
a public transportation strike) [60–62]. A closed lane, a new tram line, the opening of a
tunnel or simply the opening of a new commercial center, may change completely the way
in which network users behave and thus affect the data-based models that are intended
to reflect such mobility. Therefore, data-based modeling cannot be conceived as a static
design process. This critical adaptation should be considered in all parts of the workflow,
and constantly updated with new data:
1. In the preprocessing stage, adaptation could be understood from many perspectives:
the incorporation of new sources of data, the partial or total failure of data capturing
sensors, which lead to an increased need for data fusion, imputation, engineering
or augmentation.
2. In the modeling stage, adaptations could range from model retraining, adaptation to
new data or alternative model switching, to the change of the learning algorithm due
to a change in the system requirements (for instance, in terms of processing
latency or any other performance indicator).
3. In the prescription stage, adaptation is intended to dynamically support decisions
accounting for changes in data that propagate to the output of preceding modeling
stages. Data-based models can deal with such changes and adapt their output
accordingly, yet they are effective only up to a point. For instance, online learning strategies
devised to recover from concept drift in data streams can speed up the learning
process after the drift occurs (e.g., by diversity induction in ensembles or active
learning mechanisms). Unfortunately, even when model adaptation is considered,
the performance of the adapted model still degrades to varying degrees after the drift.
Extending adaptation to the prescription stage provides the overall workflow with an
additional capacity to adapt to changes, leveraging techniques from prescriptive
analysis such as dynamic or stochastic optimization.
Adaptations within the above stages can be observed from two perspectives: automatic
adaptations that the system is prepared to perform when certain circumstances occur,
or adaptations derived from changes introduced by the user. Thus, the adaptation
layer is strongly linked to actionability: an ITS model will be more actionable if
adaptations, either needed or imposed, are accessible to its final users. For instance, a system
could be required to incorporate a new set of data, and its impact on all the stages should
be controlled by the transportation network manager; or, if a drift is detected, the system
should consider whether it is relevant to inform the user.
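The automatic side of this adaptation layer needs some mechanism to notice that the data have changed. The sketch below uses the Page-Hinkley test, a classic sequential change detector chosen here purely for illustration (the text does not prescribe a specific detector), to flag an upward shift in the mean of a stream of traffic counts. The parameter values and the helper `first_drift` are assumptions.

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a data stream."""

    def __init__(self, delta=0.005, threshold=50.0):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold on the test statistic
        self.n = 0
        self.mean = 0.0
        self.cumulative = 0.0
        self.minimum = 0.0

    def update(self, value):
        """Consume one observation; return True if a drift alarm is raised."""
        self.n += 1
        self.mean += (value - self.mean) / self.n  # running mean
        self.cumulative += value - self.mean - self.delta
        self.minimum = min(self.minimum, self.cumulative)
        return (self.cumulative - self.minimum) > self.threshold

def first_drift(stream, detector):
    """Index of the first alarm, or None if no drift is detected."""
    for index, value in enumerate(stream):
        if detector.update(value):
            return index
    return None

# Vehicle counts that jump from 100 to 150, e.g., after a new access road opens.
counts = [100.0] * 200 + [150.0] * 50
print(first_drift(counts, PageHinkley()))
```

In an actionable system, the returned index could trigger retraining automatically, or a notification to the network manager when the magnitude of the change makes it relevant to the user.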

3. Functional Requirements for Model Actionability


Any data-based modeling process should embrace actionability as its most desirable
feature for the engineered model to yield insights of practical value, so that field stake-
holders can harness them in their decision making processes. This is certainly the case
of ITS, in which managers, transportation users and policy makers rely on models and
research results to make better and more informed decisions. Thus, once the main stages
of data-driven modeling have been outlined, this section places the spotlight on the main
functional features that ITS data-based models need in order to be fully actionable.
These functional requirements, which are shown in Figure 2, should not be understood
as a compulsory list of features, but rather as an enumeration of possibilities to make
a model actionable. Not all ITS scenarios requiring actionable data-based models
impose all these requirements, nor can actionability be thought of as a Boolean property.
Different loosely defined degrees of actionability may hold depending on the practicality
of decisions stemming from the model.

[Figure 2 (diagram): requirements contributing to actionability. Labels recovered from the figure: usability (for ATIS, for ATMS, interpretability, confidence-based output, accuracy-usability trade-off); self-sustainability (adaptability, robustness, stability and resilience, scalability); traffic theory (integrability, relevance); application context (operative context, regulatory context, privacy, social context, environmental context); transferability (large-scale application).]
Figure 2. Functional requirements for actionable data-based models in Intelligent Transportation
Systems (ITS). ATIS: Advanced Traveler Information Systems; ATMS: Advanced Transportation
Management Systems.

3.1. Usability
The way in which humans interact with information systems has been thoroughly
studied in recent decades and formalized under the general term usability [63]. Although
usability is a feature that can be associated with any system in which there is some kind
of interaction with the user, most of its definitions to date gravitate around the design
of software systems [64–66], which is not necessarily the case in ITS research. Usable
designs imply defining a clear purpose for a system and helping users make use of
it to reach their objectives [67]. Within ITS, there are domains where these definitions
apply directly [68], such as vehicle user interfaces [69–71], the development of navigation
systems [72], road signalization [73], or even the way in which public transportation
system information is shown to users [74,75].
The aforementioned domains of application, and mostly any system lying at the core
of Advanced Traveler Information Systems (ATIS), have an explicit interaction component.
On the other hand, models developed for Advanced Transportation Management Systems
(ATMS) are less related to user interaction (beyond the interface design of decision making
tools), hence this canonical definition of usability seems to be less applicable. However,
the general concept of usability can also accommodate the notion of utility, i.e., the quality of
a system of being useful for its purpose, or the concept of effectiveness, with regard to how
effective the information provided by the system is [76]. Since ITS are developed as
tools to help the different stakeholders that take part in transportation activities,
the actionability of the data-based models used to this end depends strongly on this
general idea of usability [77]. Models’ usability is a feature largely disregarded in the literature.
A clear example of this situation is traffic forecasting, a preeminent subfield of ATMS,
in which the link between high-end deep learning models and the forecast requirements
of road operators for supporting their decision making is very weak [78].
Usability may relate to the person that is going to operate the model, and to the
type and complexity of the model, which relate to specific skills. Achieving usable ITS
models does not entail the same efforts for all ITS subdomains. Thus, while for research
contributions related to ATIS there is a clear interest in this matter [79], for ATMS develop-
ments some extra considerations need to be made. Usability in ITS has, therefore, a facet
oriented towards user interface, where interfaces reflect at least one of the outputs of an
ITS data-based model, and another facet towards creating models that are more aware of
the way their outputs are going to be consumed afterwards by the decision maker.

3.1.1. User Interface


For the first of these facets, Spyridakis et al. [77] propose general software usability
measuring tools and scales such as System Usability Scale (SUS) [65], ethnographic field
studies, or even questionnaires. These basic techniques are also proposed in [80] in order
to evaluate navigation systems interfaces. There are also many other evaluation measures
that are more specific to the field, such as [81], or those defined by public authorities [82].
Some of the main techniques to appraise ITS interface usability are:
• Usability techniques: if the output of the developed model is consumed through
an interface, common techniques such as asking users directly about their
experience can be adopted [80]. Among them, SUS surveys are the standard way to obtain
interpretable metrics that can be used for the evaluation of passenger information
systems [83] or any other kind of automated traveler information system [79].
• Quality of the provided information: in [76], another perspective is proposed, based
on estimating the quality of the information provided by the model. Characteristics
such as the means to access the information, the reliability of the information provider,
or the awareness of the information availability can be measured for assessing the
model’s usability.
• Transportation-aware strategies: an alternative way to measure usability is to take
into account the transportation context and how the use of the model impacts the
system. As many of these systems are used during the course of transportation,
the environment must be considered in order to provide an adequate and pertinent
output [82]. This particular aspect is addressed below in Section 3.4.
• Public transportation guidelines: when ITS developments are intended for the public
domain, the inclusion of disadvantaged groups in the usability evaluation is a
must [81]. The extent to which these concerns are addressed by the ITS solution
should not be disregarded.
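Of the techniques listed above, the SUS survey is the one that reduces directly to a number. The sketch below applies the standard SUS scoring rule (ten items answered on a 1 to 5 scale, odd items positively worded, even items negatively worded, total rescaled to 0–100). The function names and the sample responses are hypothetical.

```python
def sus_score(responses):
    """System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten answers in questionnaire order, each in 1..5.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs ten answers, each an integer from 1 to 5")
    odd_items = sum(r - 1 for r in responses[0::2])   # items 1, 3, 5, 7, 9
    even_items = sum(5 - r for r in responses[1::2])  # items 2, 4, 6, 8, 10
    return 2.5 * (odd_items + even_items)

def mean_sus(all_responses):
    """Average SUS score over several respondents."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)

# Hypothetical answers from two users of a passenger information display.
survey = [
    [4, 2, 5, 1, 4, 2, 4, 2, 5, 1],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
print(mean_sus(survey))
```

A single averaged score is easy to report, but it says nothing about which part of the interface causes difficulty, so it is best complemented with the qualitative techniques mentioned above.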

3.1.2. Consumption of the Model’s Output


For this second usability facet, there are no scales or measurements in the literature that
provide an objective (or even subjective) usability assessment, but we propose some angles
that should be considered when designing this kind of model:
• Confidence-based outputs: data-driven models are often subject to stochasticity as a
result of their learning procedure or the uncertainty/randomness of their input data
(as especially occurs in crowdsourced and Social Media data). This randomness imprints
a certain degree of uncertainty in their outputs, which can be estimated values,
predicted categories, solutions to an optimization problem or the like. Such outcomes
are often assessed in terms of their similarity to a ground truth in order to
quantitatively assess the performance of the data-based model. Thus, a practitioner
aiming to make decisions based on the model’s output is informed with a nominal
performance score (which has been computed over test data), and the predicted output
for a given input. However, when one of such data-based models is intended to work
in a real environment, there is no ground truth to evaluate the quality of the result
it provides for making a decision. For instance, a predictive model could
score high on average as per the errors made during the testing phase. However,
predictions produced by the model could be less reliable during peak hours than
during the night, being less trustworthy in the first case as per the variability of the
data from which it was learned, and/or the model’s learning algorithm itself. For this
reason, the estimation of the confidence of outputs from a data-based model must be
analyzed for the sake of its usability. For example, a public transportation model that
provides outlooks of future demand could be more usable if, besides the estimation
itself, some kind of confidence metric was provided. Elaborating on this aspect is not
very frequent in academic research, mainly due to the fact that confidence is not al-
ways that easy to obtain and the estimation procedure is, in most cases, model-specific,
requiring a previous statistical analysis of input data to properly understand their
variability and characteristics. Unfortunately, such a confidence analysis is usually
left out of the scope of research contributions, which rather focus on finding the best
scoring model for a particular problem. Exceptions to this scarcity of related works
are [84], in which the uncertainty inherent to artificial neural networks is analyzed in
a real ITS context; [85], in which a committee of different models provides confidence
intervals for predictions; or the more recent contribution in [86], which departs
from previous findings in [87,88] to estimate the uncertainty of traffic demand. This
uncertainty estimation is then used as an input to assess the confidence of traffic
demand predictions. These few references exemplify good practices that should be
universally considered in future contributions.
• Interpretability: a stream of work has been lately concentrated around the noted need
for explaining how complex models process input data and produce decisions/actions
therefrom. Under the so-called XAI (eXplainable Artificial Intelligence) term, a torrent
of techniques have been reported lately to explain the rationale behind traditional
black-box models, mainly built for prediction purposes [89,90]. Nowadays, Deep
Learning is arguably the family of data-driven models most targeted by XAI-related
studies [91,92].
The interest of transport researchers in interpretable data-driven models is not new;
intuitively, any decision in transportation and traffic operations should be based on a
solid understanding of the mechanism by which different factors interact and influence
transportation phenomena [93]. In the transportation context, explainability is closely
related to integrability when it comes to traffic managers, as ensuring that data-based
models can be understood by non-AI experts can make them appropriately trust and
favor the inclusion of data-based models in their decisional processes. When framed
within ITS systems and processes, explainable data-based models can help
decision makers understand how information is processed along the data modeling
pipeline, including the quantification of insightful managerial aspects such as the
relationship and sensitivity of a predicted output with respect to its inputs.
• Trade-off between accuracy and usability: when ITS data-based models aim at superior
performances, they often work in ideal scenarios where the real context of application
is disregarded; should that context apply in practice, the claimed suitability of the
developed model for its particular purpose could be compromised. For instance,
the quality of an ITS model devised to detect users’ typical trajectories can be
measured with regard to the accuracy of the detected trajectories. If the pursuit of a
superb performance relies on a constant stream of data (hence, eventually depleting
the user’s phone battery), it could be a pointless achievement when put into practice.
This particular example has already been considered by plenty of researchers [94,95].
However, there is a long way to go in this aspect, as most ITS research developments
consider only ideal circumstances without regarding the implications that an accurate
design could have on its final usability.
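The confidence-based output discussed in the first item above can be approximated in a model-agnostic way. The sketch below derives context-dependent prediction intervals from held-out residuals, so that forecasts for peak hours carry wider intervals than those for the night. The grouping by hour, the 90% coverage level, the sample data and the function names are assumptions; this is one simple option, not the procedure of the cited works.

```python
import numpy as np

def fit_interval_widths(residuals, hours, coverage=0.9):
    """Half-width of the prediction interval for each hour of the day, taken as
    the empirical `coverage` quantile of absolute held-out residuals."""
    residuals = np.asarray(residuals, dtype=float)
    hours = np.asarray(hours)
    return {
        int(h): float(np.quantile(np.abs(residuals[hours == h]), coverage))
        for h in np.unique(hours)
    }

def predict_with_confidence(point_forecast, hour, widths):
    """Return (lower, upper) bounds around a point forecast for the given hour."""
    half_width = widths[hour]
    return point_forecast - half_width, point_forecast + half_width

# Hypothetical validation residuals (observed minus predicted demand).
residuals = [-20.0, 10.0, 30.0, -40.0, 1.0, -2.0, 2.0, -1.0]
hours = [8, 8, 8, 8, 3, 3, 3, 3]
widths = fit_interval_widths(residuals, hours)
print(predict_with_confidence(500.0, 8, widths))  # wide interval at peak hour
print(predict_with_confidence(60.0, 3, widths))   # narrow interval at night
```

Reporting the interval alongside the point forecast lets a manager see when a prediction deserves less trust, even though no ground truth is available at decision time.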

3.2. Self-Sustainability
In general, self-sustainability of a model refers to its ability to survive—hence, to
continue to be useful—in a dynamic environment. ITS models and developments are
usually intended to operate during long periods of time. However, it is widely accepted
that traffic and transportation phenomena are strongly dynamic in nature, meaning that
these phenomena exhibit long term trends, evolve in space and time, but also, at the
occurrence of an unexpected event, they are susceptible to abrupt changes and exhibit long
term memory effects. For instance, a trip information system based on traffic forecasts
for a certain part of the network, trained with historical data from recurrent traffic
conditions, may not be easily transferable to other road networks, nor remain efficient in case
of a severe disruption in traffic operation (e.g., an accident). What is more, if the system
does not undergo continual retraining with new data over time, it will eventually fail to
operate correctly even at the network location for which it was originally designed, due
to contextually induced non-stationarities. Thus, an intelligent transportation system
developed with data-based approaches should at least follow a set of minimum
self-sustainability requirements during the design workflow.
To better understand the importance of self-sustainability as a significant aspect
of a model’s actionability, one should bring to mind the case of cooperative ITS systems
(e.g., advanced vehicle control systems) and automated driving. In such cases, a
self-sustainable data-based model should bridge the gap between the development of a model
prototype and its deployment in a real, potentially non-stationary environment.
When an ITS system or model is deployed to operate in changing conditions, self-
sustainability involves dealing with the effects of such changes in the learned knowledge.
To this end, different strategies and design approaches could be required depending on
the nature of the change and its effects on the model. We next delve into several attributes
that can be desirable to deploy data-driven systems or models in changing environments,
rendering them actionable:
1. Adaptable: Data-driven models for ITS applications created in controlled conditions,
with static, self-contained datasets, can provide great performance metrics, but could
also fail if data evolve over time [96]. Adaptation is the reaction of a system, model or
process to new circumstances, intended to limit the deterioration of its performance with
respect to the performance expected before the change in the environment happened.
If data change over time and the model neither detects their evolution nor adapts
to it, then the developed model will eventually provide an obsolete
output. When these contextual variations occur over data streams and models are
learned on-line therefrom (e.g., on-line clustering or classification), such variations
can imprint changes in the statistical distribution of input and/or output data, making
it necessary to update such models to reflect this change in their learned knowledge.
This phenomenon is known as concept drift [97], and has been identified as an active
research challenge for most fields connected to machine learning in non-stationary
environments [98]. Many of those fields are already studying this topic, from spam
detection [99,100] to medicine [101].
There are two main lines related to concept drift: how to detect drift, and how to
adapt to it. Both lines should be on the research agenda of data-driven ITS,
as they have obvious implications when analyzing traffic [102]. Situations like road
works can completely modify traffic profiles over a certain area during a period of
time, after which the situation goes back to normal. Something similar happens with
road design changes (e.g., new lanes, transformation of types of lanes, new accesses,
roundabouts, etc.), although in those cases a new stable traffic profile remains
after the change. Even without man-made changes, traffic profiles may change
for socio-economic reasons [103]. Besides, analysis of drift can be used to detect
anomalies in the normal operation of roads [104], or to analyze patterns in maritime
traffic flow data [105]. However, the adaptability of ITS models to evolving data is
scarcely addressed in the literature, and in many cases concept drift management is
the scope of the work itself, not a circumstance that is considered to achieve a greater
goal [104,106]. There are, though, some online approaches to typical ITS problems
that consider the effects of drift in data [36,107,108], and we consider that this kind of
initiative should lead the way for actionable ITS research.
2. Robust: When an ITS system is deployed in a real-life environment, diverse kinds of
setbacks can affect its normal operation, from power failures that preclude its functioning
to the interruption of the input data flow. Robustness is a self-sustainability trait
that prevents a system from ceasing to work when external disruptions occur. Although in
most research-level designs this is not a relevant feature, it is essential for actionable,
self-sustainable designs. Robustness, defined as the ability to recover from failures,
would have, however, different requirements depending on the criticality of the ITS
system. Thus, in a traffic flow forecasting system robustness could only imply that the
system does not crash when input data fail [109], and that it continues to operate; on the
other hand, for critical systems such as air traffic management, robustness would
require additional measures to contain damage [110,111]. All in all, robust data-based
workflows should be able to accommodate unseen operational circumstances,
such as data distribution shifts or unprecedented levels of information uncertainty,
which particularly prevail in crowdsourced and Social Media data [112,113].
3. Stable and resilient: Actionable systems require a certain output stability in order to
be understandable by their users. This notion is apparently opposed to adaptability,
but while the latter is the ability to adapt the output to environment or data changes,
stability pursues maintaining the output statistically bounded even when contextual
changes occur, through e.g., model adaptation techniques. When adaptation is not
perfect and the model violates a given level of statistical stability, stability requires
another kind of adaptation, namely resilience, to make the model return to its normal
operation and thus, minimize the impact of external changes on the quality of its
output [114]. This entails, in essence, going one step further in the knowledge of
the environment and taking into account those circumstances that can affect the
system, and it could be linked to transferable models, which are addressed
below. For instance, a traffic volume characterization model would be adaptable
if it considers the changes inherent to traffic volume (an increase over time due to
economic factors), and it would be stable if a change in the weather conditions does
not deteriorate its performance or, in other words, if it has accounted for this essential
circumstance. This kind of consideration is almost nonexistent in the literature [78],
yet it is crucial for a model to be self-sustainable.
4. Scalable: In the research environment, tests are run at a limited scale, constrained to
the size of the data and useful for the experiments, in contrast with large, multi-variate
real environments. Scaling up is not, of course, a matter of ITS research, but an
engineering problem. However, models should be designed to be scalable from
their conception.
Leaving aside calibration and training phases, classic transportation theories tend in
general to be computationally more affordable than data-driven models. However,
the unprecedented amount of computing power available nowadays rules out any
real practical limitation due to the computational complexity of learning algorithms
in data-based modeling. An exception occurs with models falling within the Deep
Learning category which, depending on their architecture and size of training data,
may require specialized computing hardware such as GPU or multi-core equipment.
Nevertheless, the rising trend in terms of scalability is to make data-based models
incremental and adaptable [3], which finds its rationale not only in the environmental
sustainability of data centers (lower energy consumption and thereby, carbon foot-
print), but also in the deployment of scalable model architectures on edge devices,
usually with significantly less computing resources than data centers.
Although some ITS problems are easy to scale, so that this feature would not be troublesome,
there are some fields that can be very sensitive to scalability. For instance,
route planners frequently consist of shortest-path problem and traveling salesman
problem implementations that increase in complexity when the number of nodes
grows [115]. This is a good example where artificial intelligence and optimization
tools provide solutions that are actionable in terms of scalability, and where examples
are easy to find [116,117]. Caring about aspects like the ease of introducing
new variables when needed, the complexity of tuning where applicable, or the execution
time would make a model more actionable by increasing its self-sustainability. This
need for scalability is not just a matter related to the computational complexity of
modeling elements along the pipeline, but also links to the feasibility of migrating the
designed models from a lab setup to, e.g., a Big Data computing architecture. Unfortunately,
few publications currently reflect on whether their proposed data-based
workflows can be deployed and run on legacy ITS systems, thereby avoiding costly
upgrade investments in computing equipment.
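The minimal form of robustness described in item 2, a forecasting system that keeps operating when its input data fail, can be sketched as a thin wrapper around the model. The class name, the validity checks and the fallback to a historical hourly profile are assumptions made for illustration; critical systems would require far more than this.

```python
import math

class RobustForecaster:
    """Wraps a forecasting model and falls back to a historical profile
    when the latest sensor reading is missing or invalid."""

    def __init__(self, model, historical_profile):
        self.model = model                  # callable: (reading, hour) -> forecast
        self.profile = historical_profile   # dict: hour -> typical flow

    def forecast(self, latest_reading, hour):
        """Return (forecast, source), where source is 'model' or 'fallback'."""
        invalid = (
            latest_reading is None
            or math.isnan(latest_reading)
            or latest_reading < 0
        )
        if invalid:
            return self.profile[hour], "fallback"
        try:
            return self.model(latest_reading, hour), "model"
        except Exception:
            # Any model failure degrades gracefully instead of stopping the service.
            return self.profile[hour], "fallback"

# Hypothetical one-step model and historical profile for 08:00.
forecaster = RobustForecaster(lambda reading, hour: reading + 10.0, {8: 420.0})
print(forecaster.forecast(400.0, 8))         # normal operation
print(forecaster.forecast(float("nan"), 8))  # sensor failure
```

Returning the source alongside the value lets downstream consumers know when a forecast is degraded, which also supports the confidence-based outputs discussed in Section 3.1.2.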

3.3. Traffic Theory Awareness


Theoretical representations of traffic attempt to construct (mostly simple) models with
causal aspects. These models are usually of a closed form and are frequently dictated by
simplifying assumptions, which leads to limited performance when modeling complex
spatio-temporal dynamics in the microscopic analysis context. In these models, data are
instrumental to estimate how well they fit real world conditions. On the other hand,
and since their upsurge in the 1980s, data-driven models rely exclusively on the data to
extract the dynamics that govern the phenomena. This, at least theoretically, makes them
more adaptable and more efficient in complex conditions when compared to theory-based
models. However, they can hardly claim applicability in large-scale scenarios (city-level traffic
management) due to their significant computational resource requirements. Such data-driven
traffic models have been systematically implemented as proofs of concept and are now
dominant in Traffic Engineering literature [16], incorporating most well-known advanced
techniques, and, in many cases, ignoring the elementary knowledge of traffic and focusing
blindly on performance.
Owing to the above, researchers in traffic modeling have diversified the way in
which their models are developed and evaluated, fitting them to the technology that is
introduced, as opposed to fitting the model to the knowledge described in
well-established theories of traffic flow. This results in models that are hardly actionable for
traffic engineers, in terms of integration into legacy traffic control and management systems
and relevance to the decision making process of road network operators. Besides, there is a
lack of standards regarding the data and scenarios used to assess performance,
usually because of the real data available to each researcher. This was already identified
in [78], where test-beds were proposed, either generating them or using some of the existing
ones as standards. This would help compare models and understand them better, as they could be
evaluated in a known environment, and obtain their insights concerning traffic theory.
Besides, as we anticipated in Section 2.1, there is an industrial trend towards the consideration
of different data sources when modeling traffic dynamics. In many cases, these data sources
do not have any straightforward relationship to traffic itself. The integration of these sources
of data, the models learned from them and theoretical representations of transportation
scenarios remains an open challenge that has started to be addressed in the literature [118–120].
In this line of reasoning, linking data-driven to theory-based models in transportation
may lead to efficient and physically consistent representations of transportation
phenomena. In fields like traffic modeling and forecasting, this hybrid approach makes it possible
to consider theoretical aspects of traffic, such as the relationship among speed, flow and
density, the three phases of traffic [121], or the Breakdown Minimization Principle [122]
when modeling bottlenecks. The consideration of these theoretical concepts takes effect
mainly in the preprocessing, modeling and prescription phases of the modeling work-
flow. In preprocessing, domain knowledge can be crucial for feature engineering, by
describing how available features are related to each other, estimating collinearities in
advance, deleting irrelevant predictors, or obtaining feature combinations with improved
modeling power [78]. Applying traffic theories and principles can also be useful for data
augmentation and missing data imputation, by simulating or generating data that are more
akin to what the context can provide [36]. In the modeling phase, previously defined
mathematical frameworks can help define constraints and operation ranges, and correct the
output of data-based models, which by themselves do not check whether their output complies
with well-established theories. Lastly, in the prescription phase, model outputs
can be linked to traffic theory knowledge to improve the way in which they are applied:
a predicted flow value can be more useful if the travel time or the bottleneck probability
can be computed afterwards. Furthermore, in the case of predictive models, they can
reach a point in which the provided predictions ultimately affect the future behavior of the
models themselves, if they are trained only with observed past data. For instance, a model
that assists traffic management decisions, like closing a lane, might lead to a situation
that has not been observed by the model before, thus making the knowledge captured
by the traffic model obsolete and useless until the data captured from the environment
is exploited for retraining. Physical models can be highly useful to anticipate such scenarios
and complement data-based models, providing additional information about how theories or
simulations expect the scenario to behave.
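As a minimal sketch of how theory can constrain and extend a data-based output, the snippet below clips a predicted flow to the capacity implied by a speed-density relationship and then derives a travel time from it. The Greenshields model, the parameter values and all function names are assumptions chosen for illustration; the discussion above does not prescribe any particular traffic model.

```python
import math


def capacity(free_flow_speed=100.0, jam_density=120.0):
    """Maximum flow (veh/h) of the Greenshields model, reached at half the jam density."""
    return free_flow_speed * jam_density / 4.0


def correct_predicted_flow(predicted_flow, free_flow_speed=100.0, jam_density=120.0):
    """Clip a data-based flow prediction to the feasible range [0, capacity]."""
    return min(max(predicted_flow, 0.0), capacity(free_flow_speed, jam_density))


def travel_time_hours(flow, length_km, free_flow_speed=100.0, jam_density=120.0):
    """Travel time on a link, assuming the uncongested branch of the flow-density curve."""
    flow = correct_predicted_flow(flow, free_flow_speed, jam_density)
    ratio = flow / capacity(free_flow_speed, jam_density)
    # Greenshields: q = k * vf * (1 - k / kj); the uncongested root gives this speed.
    speed = 0.5 * free_flow_speed * (1.0 + math.sqrt(1.0 - ratio))
    return length_km / speed
```

Under these assumed parameters, a prediction of 3500 veh/h would be corrected to the 3000 veh/h capacity before any travel time is derived from it.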
This emergent modeling paradigm is known as Theory Guided Data Science, and aims
to enhance data-driven models by integrating scientific knowledge [123]. The main objective
of this approach is to enable insightful learning, using the theoretical foundations of
a specific discipline to tackle the problems of data representativeness, spurious patterns
found in datasets, and physically inconsistent solutions. From the algorithmic
point of view, this induction of domain knowledge can be done in assorted
ways, such as the use of specially devised regularization terms in predictive models
(e.g., in the loss function of Deep Learning models), data cleansing strategies that account
for known data correlations, or memetic solvers that incorporate local search methods
embedding problem-specific heuristics. In transportation, there have been several examples
of theory-enhanced models, ranging from traffic condition identification and characterization
[124,125], to data-driven and agent-based traffic simulation models for control and
management [3,42,126,127], or cooperative intelligent driving services [128].
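The idea of a theory-derived regularization term can be sketched as follows: a standard data-fit term is combined with a penalty on the violation of the fundamental relation between flow, density and speed (q = k · v). The function name, the choice of this particular relation and the `weight` hyperparameter are assumptions made for illustration only.

```python
import numpy as np


def theory_guided_loss(flow_pred, flow_obs, density, speed, weight=0.1):
    """Mean squared error plus a penalty for violating the relation q = k * v."""
    flow_pred = np.asarray(flow_pred, dtype=float)
    flow_obs = np.asarray(flow_obs, dtype=float)
    # Data term: disagreement between predictions and observations.
    data_term = np.mean((flow_pred - flow_obs) ** 2)
    # Physics term: disagreement between predictions and the theoretical flow.
    theoretical_flow = np.asarray(density, dtype=float) * np.asarray(speed, dtype=float)
    physics_term = np.mean((flow_pred - theoretical_flow) ** 2)
    return float(data_term + weight * physics_term)
```

A larger `weight` pushes the learned model towards physically consistent outputs at the expense of fitting noisy observations; how to balance both terms is left open here.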
Awareness of domain-specific knowledge can also be enforced at the end of the
workflow. When decision making is formulated as an optimization problem, the family of
optimization strategies known as Memetic Computing [129,130] has been used for years
to combine global search techniques with low-level
local search heuristics. These heuristics can be driven by intuition when tackling
the optimization problem at hand or, more suitably for actionability purposes, by a priori
knowledge about the decision making process gained as a result of human experience
or prevailing theories. For instance, traffic management under incidents in the road
network can largely benefit from the human knowledge acquired over the years by the manager
in charge, since this knowledge may embed features of the traffic dynamics that are not
easily observable from historical data. This knowledge can be inserted in an optimization
algorithm devised to decide, e.g., which lanes should be rerouted after an accident.
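The structure of a memetic solver can be illustrated with a toy sketch in which an evolutionary global search produces candidate decisions (here, a binary vector that could encode which lanes to reroute) and every offspring is refined by a local search step. The bit-flip refinement merely stands in for an expert-driven heuristic, and all names, operators and parameter values are assumptions rather than the methods of [129,130].

```python
import random


def local_search(solution, cost):
    """Problem-specific refinement: flip single bits while the cost improves."""
    best, best_cost = list(solution), cost(solution)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            candidate = best[:]
            candidate[i] = 1 - candidate[i]
            candidate_cost = cost(candidate)
            if candidate_cost < best_cost:
                best, best_cost, improved = candidate, candidate_cost, True
    return best


def memetic_search(cost, n_bits, pop_size=10, generations=20, seed=0):
    """Global evolutionary search whose offspring are refined by local search."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]   # keep the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)
            child = a[:cut] + b[cut:]           # one-point crossover
            flip = rng.randrange(n_bits)        # single-bit mutation
            child[flip] = 1 - child[flip]
            children.append(local_search(child, cost))
        population = parents + children
    return min(population, key=cost)
```

In a realistic setting the `cost` function would come from a traffic model or simulator, and the local search would encode the operator's rules of thumb instead of blind bit flips.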

3.4. Application Context Awareness


Transportation is exceptionally diverse around the world, with notable differences in
modes, preferences and availability due to social, economic and cultural disparity. Moreover,
Intelligent Transportation Systems with different purposes also have characteristic
requirements that can be very divergent with respect to space and time. To address
this landscape of complex and sometimes conflicting goals, policies and decision making
should span from a few seconds (traffic management and control) to years (planning and
design of new systems). It is strongly argued that data-driven frameworks are able to
cope with context-aware datasets, due to their inherent capability of learning patterns
hidden in rich data and of reconstructing, in a sense, the context of the application.
Typical examples of such context-aware systems are the extraction of Origin-Destination
matrices from cellphone-based data [131], mobility applications that aim to improve
the mobility footprint of users [132], and smartphone-based driving insurance
systems [133]. Although these approaches seem appropriate to complement the user’s
or system’s experience of a problem, significant uncertainty lies in their transferability and
accuracy, owing to the lack of context-aware knowledge.
A certain degree of awareness of the context should be a matter of concern when
developing ITS models that intend to be actionable. Context-aware information is usually
introduced in the modeling, for example by accounting for the demographic characteristics
of the application area, the type of road or network, the mode, the travel purpose, etc.
However, what is usually disregarded is a much broader consideration of the operational
and system characteristics, such as how models can be introduced into the operations at
hand, what the privacy concerns are with respect to data and information flows, and what
the regulatory framework, policy-level restrictions and goals to be reached are.
First, within the operation, the deployment context where a developed model is
intended to be implemented can enforce a series of operative constraints. Creating and
proposing an ITS model without observing these requirements is an exercise in futility,
given its lack of actionability. From this operation perspective, the context ranges from
deployment and operation costs (is the system cost-efficient considering its potential
service?) to functioning modes (does the model meet the expected response times? can it
operate in environments with reduced computational power?). As an illustration, a system designed to
detect and identify pedestrians can be very effective in terms of performance, but if it does
not operate at an appropriate speed, or it needs more demanding computations than those that can
be embarked in a vehicle, it is useless for an autonomous driving context [134]. A similar
reasoning holds if by operation cost one thinks about the energy consumption of the model
at hand. Questions such as whether the energy consumed by the model is compliant with
the system should be kept in mind at design time, but also from the academic perspective,
where efforts should be directed to the development of models that are consistent with
the actual operating constraints.
Second, regulations constitute a hard and highly contextual constraint in the implementation
of ITS. Besides the wide regulatory differences that can be found across regions,
there are transport frameworks where regulations are especially rigid. A typical example
is the case of airports [135], where there is a broad field for specialized ITS. Another
example is the constantly rising use of drone systems to monitor traffic [136]. Models that
fail to relate to the application’s regulatory environment are not actionable.
Third, data privacy and sovereignty constitute a growing concern in a connected
world where, after a decade of handing over data with complacency, an awareness about
personal information sharing is emerging. A recent example is the introduction of the
EU General Data Protection Regulation (GDPR) framework, which severely challenged the
manner in which data were introduced to models, as well as data availability. ITS models that are
based on personal data are common nowadays, for instance in floating car data based
developments [137]. However, there are fields where this aspect is becoming crucial
(autonomous driving connectivity [138], security in public transport environments [139]),
and research is steering towards privacy-preserving approaches [140], spheres where technologies
such as Blockchain can play a major role [141,142].
Fourth, social aspects of the application play a major role in modeling. Social trans-
portation is the subfield in ITS where the “social” information coming from mobile devices,
wearable devices and social media is used for a number of ITS management related applica-
tions [143]. The outcomes from social transportation may be, to name a few, traffic analysis
and forecasting [144,145], transportation based social media [146], transportation knowl-
edge automation in the form of recommending systems and decision support systems [147],
and services for the collection of further signals to be used later for the already mentioned
purposes or others. However, cultural differences can have a relevant impact on how
these systems operate, as social data are most commonly linked strongly to geographical
information. This is a key aspect for their actionability.
Fifth, transportation is currently a large source of greenhouse-gas emissions [148].
Environmental concerns are gaining momentum in a wide range of ITS applications, such as the
discovery of parking spots [149], multimodality applications that grant travelers the chance of
using collective transportation systems efficiently and conveniently [150], the improvement
of logistics operations [151], shared mobility applications, which help reduce the number
of one-passenger vehicles in the road network [152], or driving analytics to improve safety
and ecological footprint [153–155].
Of course, research goes beyond the application context and does not need to be
always connected to a certain application scenario. A prototype can be far from the
practical requirements of its eventual deployment; still, knowing the essential common
ground of the application is key to converging to actionable models. Unfortunately, this is a
matter frequently disregarded in ITS research.

3.5. Transferability
Within the research context, it is common to employ test data to assess the models.
Regardless of whether these data are obtained from real sources or synthetically generated, the
resulting models have been built around them, and can be heavily linked to that experimentation
context. Would these models work in other contexts or with other input data? Transferability
could be defined as the quality of a data-driven model to be applied in another environment
with other data, and it is directly linked with actionability: the application of a model
should be generalizable to different datasets and transportation settings. This definition
stems from the more general concept of Transfer Learning [156], which can entail that models
trained in a certain domain are applied to other domains, so that the previous knowledge
obtained from the former makes them perform better in the latter than models without it.
Depending on the subcategory of ITS, this requirement can be easy to meet or arduous to
achieve, as some subcategories are more oriented towards the application and rely less on
the environment than others; the key is defining what the environment is. For example, a travel
time forecasting model developed with data of a certain location could be transferable to
another location without great complications, if it is built considering this feature [157].
In fact, many ITS models that are spatially sensitive are developed using real data but, within
the experimentation context, they are evaluated only in certain locations. Transferability for
these scenarios would imply that the obtained results are reproducible (with a certain degree
of tolerance) in other locations. This could range from plainly extrapolating the model to
other locations [158], to implementing techniques such as soft-sensing, aimed at modeling
situations where no sensor is available [159] and the environment information is enough
to obtain these models. A similar case in terms of spatial contexts, but with more parameter
complexity, requires plenty of information about the environment. As an illustration,
crash risk estimation implies higher calibration and adjustment needs due
to the higher number of parameters that take part in this type of estimation. In these
circumstances, works such as [160] or [161] rely on posterior probability models and
give more relevance to models that reach a certain performance in many contexts
than to models that perform better in a particular location. On the other extreme, for cases
like autonomous driving, the change of environment is inherent to the domain (a moving
vehicle constantly changing its location), and the parameters of these models are abundant
and highly variable. Thus, these applications need transferable solutions, and transferability
is specifically sought by researchers, for instance in LIDAR-based localization [162] or
pedestrian motion estimation [163]. In any case, and regardless of the domain, ITS research is
at an incipient stage (probably with the exception of autonomous driving) in developing
transferable models and evaluating this feature, and some machine learning paradigms
can help improve this characteristic.
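One simple way to evaluate the spatial transferability discussed above is a leave-one-location-out protocol: the model is trained on all locations but one and its error is reported on the held-out location. The sketch below assumes a dictionary mapping hypothetical location names to feature/target arrays and uses an ordinary least-squares model purely as a placeholder for any learner.

```python
import numpy as np


def fit_least_squares(X, y):
    """Placeholder learner: ordinary least squares without intercept."""
    weights, *_ = np.linalg.lstsq(X, y, rcond=None)
    return weights


def predict_linear(weights, X):
    return X @ weights


def leave_one_location_out(datasets, fit, predict):
    """Train on all locations but one; report mean absolute error on the held-out one."""
    errors = {}
    for held_out in datasets:
        X_train = np.vstack([X for name, (X, y) in datasets.items() if name != held_out])
        y_train = np.concatenate([y for name, (X, y) in datasets.items() if name != held_out])
        model = fit(X_train, y_train)
        X_test, y_test = datasets[held_out]
        errors[held_out] = float(np.mean(np.abs(predict(model, X_test) - y_test)))
    return errors
```

A model whose held-out errors stay close to its in-sample errors across all locations would be a candidate for being considered transferable in this narrow, spatial sense.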

4. Emerging AI Areas towards Actionable ITS


We have hitherto elaborated on the requisites that a model should meet in order to
lead to actionable data-based insights in ITS applications and processes. Some of these
requirements can be fulfilled by properly designing the data-based workflow (e.g., interpretability
can be straightforward for certain prediction models, whereas adaptability can
be enforced by periodically scheduling the learning algorithm in use and feeding it
with new data). However, several research areas have stemmed in recent years from
the wide fields of Data Science and Artificial Intelligence that may serve to catalyze the
compliance of data-based ITS workflows with the prescribed requisites, and thereby attain
the sought actionability of their produced insights.
The main AI areas that have been identified as potentially appropriate for addressing
the requirements can be summarized briefly as follows:
• Real-time data processing and online learning, which are not brand new research
avenues in ITS, as we can find advanced developments in the literature. However, as
we will later show, emerging fields with great potential such as dynamic data fusion
and dynamic optimization can expedite and extend the adoption of incremental
data-based models in more ITS-related applications.
• Transfer learning and domain adaptation, which could allow models to be developed for
certain contexts and exported to others, linking directly to the transferability
requirement, but also to the integration of transportation theories and physical models
into data-based models.
• Gray-box modeling, a paradigm halfway between white-box (physical) and black-
box (data-based) models. Gray-box modeling represents a promising area to bring
awareness to traffic theory and other physical modeling when developing data-based
models, with the potential to increase the performance, usability and comprehensibil-
ity of the latter.
• Green AI, a trend in Artificial Intelligence research that connects directly with energy
and cost efficiency. Developing efficient models has a relevant impact on their
sustainability and context awareness.
• Fairness, Accountability, Transparency and Ethics: Data-based models, especially
those learning from large amounts of diverse data from many sources, are vulnerable to
biases, and can compromise aspects such as the fairness of decisions or the differential
privacy of data. In this context of growing sources of data, including those gathered
from people, and increasingly opaque data-based models, it has become essential to
understand what models have learned from data, and to analyze them beyond their
predictive performance to consider ethical, societal and legal aspects. These aspects
have been scarcely considered in ITS research.
• Other Artificial Intelligence areas such as imbalanced learning, reinforcement learning
and adversarial machine learning, which are later highlighted for their noted relevance in ITS.
We next discuss the research opportunities spurred by the above research lines,
their connections with the requirements presented in Section 3 (shown in Figure 3), as well
as the challenges that stem from the consideration of these AI areas in the context of ITS.
[Figure 3 graphic. Recoverable labels: DATA, ACTIONS; functional requirements USABILITY, SELF-SUSTAINABILITY, TRAFFIC THEORY AWARENESS, APPLICATION CONTEXT AWARENESS, TRANSFERABILITY.]

Figure 3. Schematic diagram showing how avant-garde AI subareas can promote actionability in ITS data-based modeling
workflows. Subareas contributing with particular emphasis to different functional requirements are connected together
along the way from data to actions.

4.1. Online Learning and Dynamic Data Fusion/Optimization


Previously sketched in Section 3.2, online learning refers to the capability of the
learning model and, in general, of the entire workflow, to learn from rapidly arriving data
possibly produced by non-stationary phenomena, which enforce a need for adapting the
knowledge captured by the model over time. Changes over data streams can make the
data pipeline obsolete, thus demanding active or passive techniques to update it with the
characteristics of the stream [7,97].
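A passive adaptation strategy can be sketched with a model that is updated after every arriving sample, so that it keeps tracking the stream when the underlying relationship changes. The normalized least-mean-squares update, the class name and the learning rate below are assumptions chosen for brevity, not the techniques surveyed in [7,97].

```python
import numpy as np


class OnlineLinearModel:
    """Linear predictor updated one sample at a time (normalized LMS rule)."""

    def __init__(self, n_features, learning_rate=0.5):
        self.weights = np.zeros(n_features)
        self.learning_rate = learning_rate

    def predict(self, x):
        return float(np.dot(self.weights, x))

    def update(self, x, y):
        """Predict first, then learn from the revealed target (test-then-train)."""
        x = np.asarray(x, dtype=float)
        error = y - self.predict(x)
        # Step size is normalized by the input energy to keep the update stable.
        self.weights += self.learning_rate * error * x / (1e-12 + np.dot(x, x))
        return error
```

An active strategy would additionally monitor the returned errors and trigger a reset or retraining when a drift is detected; that detector is deliberately left out of this sketch.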
Although activity around online learning has mostly revolved around certain clustering
and classification paradigms (the latter giving rise to the so-called concept drift term to
refer to pattern changes), it is important to note that adaptation can also be needed in
other stages of the actionable data-based workflow, from data fusion to the prescription of
actions. This being said, research areas such as dynamic optimization and dynamic multi-sensor
data fusion should also be investigated deeply in future studies related to actionable
data-based models, especially when the scenario under analysis can produce information
with non-stationary statistical characteristics. When merging different data sources, fusion
strategies at different levels can be designed and implemented, from traditional means
(data-level fusion, knowledge-level fusion) to modern methods (corr. model-based fusion,
federated learning or multiview learning) [164,165]. Fusion of correlated data sources can
compensate for missing entries or noisy instances in static environments. However, when
data evolve over time as a result of their non-stationarity, new challenges may arise with
regard to the inconsistency among multiple information sources, including measurement
discrepancy, inconsistent spatial and temporal resolutions, or the timeliness/obsolescence
of the data flows to be merged, among other issues. For this reason, close attention
should be paid to advances reported around adaptive fusion methods capable of detecting,
counteracting and correcting misalignments between data flows that occur and evolve over
time. This branch of dynamic data fusion schemes aims at combining information
flows produced by non-stationary sources, synthesizing a representation of the recent
history of each of the flows to be merged into a set of more coherent, useful data inputs
to the rest of the data-based pipeline [166,167]. On the other hand, dynamic optimization
techniques can efficiently deliver optimized actionable policies when the objectives and/or
constraints of the underlying optimization problem vary [168,169]. We energetically
advocate for a widespread embrace of advances in these fields by the ITS community,
with emphasis on those scenarios whose dynamic nature can make the obtained actionable
insights eventually obsolete. This is the case, for instance, of traffic-related modeling
problems (e.g., traffic flow forecasting and optimal routing) or driver characterization for
consumption minimization, among many others.
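To make the notion of adaptive fusion more tangible, the sketch below merges several data flows with weights that are inversely proportional to a running, exponentially forgotten estimate of each source's squared error. The availability of a (possibly delayed) reference value, the forgetting factor and the class name are all assumptions for illustration; the schemes in [166,167] are considerably more elaborate.

```python
class AdaptiveFusion:
    """Fuse several streams with weights inversely proportional to their recent error."""

    def __init__(self, n_sources, forgetting=0.9):
        self.forgetting = forgetting
        self.error = [1.0] * n_sources  # running mean of squared error per source

    def fuse(self, readings):
        """Weighted average of the current readings of all sources."""
        weights = [1.0 / (e + 1e-9) for e in self.error]
        total = sum(weights)
        return sum(w * r for w, r in zip(weights, readings)) / total

    def update(self, readings, reference):
        """Refresh the error estimates once a reference value becomes available."""
        for i, reading in enumerate(readings):
            self.error[i] = (self.forgetting * self.error[i]
                             + (1.0 - self.forgetting) * (reading - reference) ** 2)
```

Because old errors are progressively forgotten, a source that degrades (or recovers) over time sees its weight change accordingly, which is the behavior sought in non-stationary settings.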
Other requirements for actionability can also benefit from the adoption of the above
models in dynamic ITS contexts. For instance, cost efficiency in terms of energy consumption
can benefit largely from the incremental nature that often characterizes online learning models.
The use of dynamic data fusion can also drastically reduce the usage of communication
resources in wireless V2V links, such as those established in cooperative driving scenarios.
All in all, the recent literature leaves no doubt about the relevance of adaptation in data-based
modeling exercises noted in this work, with an increasing volume of contributions
dealing with the extrapolation of adaptation mechanisms to ITS problems [170–172].

4.2. Transfer Learning and Domain Adaptation


Closely related in meaning to its associated actionability requirement (transferability), transfer
learning aims at deriving novel means to export the knowledge captured by a data-based
model for a given task to another task with different inputs and/or outputs [156]. Depending
on the degree of similarity between the origin task and the destination task, we may
also be referring to domain adaptation, by which we adapt the model built to perform a certain
task to make it generalize better when processing new unseen inputs that do not follow
the same distribution as their original counterparts (only the distribution changes [173]).
Techniques such as subspace mapping, representation learning, or feature weighting arise
as the methods most used to allow knowledge to be transferred between data-based
models used for prediction.
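As a deliberately simple illustration of domain adaptation, the following sketch rescales the features of a new domain so that their mean and standard deviation match those of the domain on which the model was built. This moment matching is only a basic form of feature alignment, assumed here for illustration; it is not one of the specific subspace mapping or representation learning techniques mentioned above, and the function name is hypothetical.

```python
import numpy as np


def align_features(inputs, reference, eps=1e-12):
    """Rescale each column of `inputs` to the mean and standard deviation of `reference`."""
    inputs = np.asarray(inputs, dtype=float)
    reference = np.asarray(reference, dtype=float)
    # Standardize the new-domain features column by column.
    standardized = (inputs - inputs.mean(axis=0)) / (inputs.std(axis=0) + eps)
    # Re-express them in the scale of the reference (original) domain.
    return standardized * reference.std(axis=0) + reference.mean(axis=0)
```

After alignment, a model trained on the reference domain receives inputs with first and second moments similar to those seen during training, which may or may not suffice depending on how the distributions actually differ.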
In essence, transfer learning can provide higher prediction accuracy for models whose
number of parameters to be learned (e.g., weights in a Neural Network) demands higher
amounts of labeled data than those available in practice. However, data augmentation
is not the only goal targeted by transfer learning. Domain adaptation may yield a better
performance when used between ITS models that can become severely affected by a lack of
calibration, different configurations or diverging specifications. An immediate example
illustrating this hypothesis is the use of camera sensors for vehicular perception. Models
trained to detect and identify objects in the surroundings of the vehicle can fail if the images
provided as their inputs are produced by image sensors with different specifications. The same
holds for car engine prognosis: replaced components can render a data-based characterization
of the normal operation of the engine useless in practice unless a domain adaptation
mechanism is applied. Personalization of ITS services can be another problem where
domain adaptation can help refine a model trained with data from many sources: a clear
example springs from naturalistic driving, where a behavioral characterization model built
at first instance from driving data produced by many individuals (source domain) can be
progressively specialized to the particular driver of the car where it is deployed [174–176].
With regard to actionability, several functional requisites can be approached by using
elements from Transfer Learning over the data-based pipeline. To begin with, it should be
clear that the transferability of learned models for their deployment in different locations
and contexts could be vastly improved by Transfer Learning, as the purpose of this AI
branch is indeed to meet this requirement in data-based learning models. In fact, this approach
is currently under study and widely adopted within the ITS community working on
vehicular perception: when the capability of the vehicle to sense and identify its surroundings
hinges on learning models (e.g., Deep Learning for image segmentation with cameras),
a plethora of contributions depart from pretrained models, which are later particularized
for the problem/scenario at hand [177]. This exemplified use case supports our advocacy
for further efforts to incorporate transfer learning methods in other ITS applications, especially
those where data collection and supervision are not straightforward to achieve in
practice. Another functional requirement where Transfer Learning can make a difference in
ITS developments to come is cost efficiency. The knowledge transferred between models
learned from different contexts can improve their performance, thereby reducing the need
for supervising data instances and, ultimately, the time, costs and resources required to
perform the data annotation.
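The practice of departing from a pretrained model and particularizing it for the scenario at hand can be sketched, in its simplest form, as continuing gradient descent from the pretrained parameters on a small target dataset. The linear model, the squared-error objective, the function name and the hyperparameter values below are assumptions for illustration and are far simpler than the Deep Learning models referred to above.

```python
import numpy as np


def fine_tune(pretrained_weights, X, y, learning_rate=0.1, steps=500):
    """Continue gradient descent on target-domain data, starting from pretrained weights."""
    weights = np.array(pretrained_weights, dtype=float)  # copy, so the source model is kept
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    for _ in range(steps):
        # Gradient of the mean squared error of a linear model.
        gradient = 2.0 * X.T @ (X @ weights - y) / len(y)
        weights -= learning_rate * gradient
    return weights
```

Starting from pretrained weights rather than from scratch is what allows a handful of annotated target samples to be enough, which relates to the reduction in annotation cost argued above.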
Finally, the more recent paradigm coined as Federated Learning refers to the privacy-preserving
exchange of captured knowledge among models deployed in different contexts
[178,179]. Although the initial inception of Federated Learning
targeted the mobile sector, techniques supporting the federation of distributed data-based
models can be of utmost importance in the future of ITS, especially for V2V communications
among autonomous vehicles and in-vehicle ATIS systems. The
enrichment of models with global knowledge about the data-based task(s) at hand will
definitely represent a breakthrough in vehicular safety and driving experience. For instance,
federated models can collectively identify, assess and counteract the risk of more
complex vehicular scenarios than each of them in isolation [180]. Likewise, ATIS systems
can learn from the preferences and habits of other users to better anticipate the preferences
of the driver and act accordingly [181]. In a few words: an enhanced and more effective
actionability of the data-based workflows built to undertake such tasks.
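The aggregation step at the core of many federated schemes can be sketched as a weighted average of the parameters learned locally by each participant (e.g., each vehicle), so that only parameters, and never raw data, are exchanged. The sketch follows the spirit of federated averaging; the function name and the weighting by local sample count are assumptions, and communication, security and client selection are omitted.

```python
import numpy as np


def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its local sample count."""
    stacked = np.stack([np.asarray(w, dtype=float) for w in client_weights])
    sizes = np.asarray(client_sizes, dtype=float)
    # Each row (client) contributes proportionally to the data it has seen.
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()
```

The resulting global parameters would then be sent back to the participants for a further round of local training.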

4.3. Gray-Box Modeling


Gray-box modeling refers to the design of models that combine theoretical developments
and structures related to the problem with data that serve as a complement for
such theories, so that the overall model better matches the scenario under analysis [182,183].
Gray-box models lie in between white-box models, for which the learned structure is
deterministic and grounded in theoretical concepts; and black-box models, whose internal
structure lacks physical significance and is learned from data. An example of a white-box
model in ITS is the use of computational fluid dynamics for macroscopic traffic
flow modeling, whereas Deep Learning models for traffic forecasting exemplify
black-box modeling in this domain. Gray-box models have been lately embraced by the
ITS community in a number of modeling scenarios, such as those combining biological
concepts and data-based models for driver characterization [184,185].
Gray-box modeling can contribute to the actionability of data-based workflows for ITS
applications in two different albeit interconnected directions. To begin with, the incorporation
of theoretical models into data-based pipelines can narrow the gap between data-based
modeling and the engineers and practitioners more acquainted with traditional tools to
analyze ITS systems and processes. Indeed, hybrid modeling can tie both worlds together not only without questioning
the validity of prevalent theoretical developments, but also evincing the complementarity
and synergy of both approaches. On the other hand, using validated theoretical models
can help data-based modeling overcome difficult learning contexts such as class imbalance,
outlier characterization or the partial interpretability of data clusters, among others.
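A common way to build a gray-box model is to let a theoretical model provide a first estimate and to learn from data only the residual that the theory fails to explain. The sketch below assumes a Greenshields speed-density relationship as the white-box part and a linear least-squares correction as the data-based part; both choices, as well as all names and parameter values, are illustrative assumptions.

```python
import numpy as np


def greenshields_speed(density, free_flow_speed=100.0, jam_density=120.0):
    """White-box part: speed (km/h) as a function of density (veh/km)."""
    return free_flow_speed * (1.0 - np.asarray(density, dtype=float) / jam_density)


class GrayBoxSpeedModel:
    """Physical speed estimate plus a data-based linear correction of its residual."""

    def __init__(self):
        self.coefficients = np.zeros(2)

    @staticmethod
    def _design(density):
        density = np.asarray(density, dtype=float)
        return np.column_stack([density, np.ones_like(density)])

    def fit(self, density, observed_speed):
        # Learn only what the theoretical model does not explain.
        residual = np.asarray(observed_speed, dtype=float) - greenshields_speed(density)
        self.coefficients, *_ = np.linalg.lstsq(self._design(density), residual, rcond=None)
        return self

    def predict(self, density):
        return greenshields_speed(density) + self._design(density) @ self.coefficients
```

Because the physical term is kept explicit, the learned correction remains small and inspectable, which is one way of preserving the interpretability that practitioners expect from theory-based tools.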

4.4. Green Artificial Intelligence


A prolific strand of literature has recently stressed the energy efficiency of data-based
models, highlighting the need for redesigning their learning algorithms to minimize
their energy consumption and thereby make them implementable and usable in practice
[186–188]. While this issue is particularly relevant for resource-constrained devices
(e.g., mobile hand-helds), the concern with energy efficiency goes beyond usability towards
environmental friendliness. For this reason many recent contributions are striving for
computationally lightweight variants of machine learning models that sacrifice some performance
for a notable reduction of their energy demand. This is not only the case of predictive
models capable of incrementally learning from data, but also of specific Deep Learning
architectures tailored for their deployment on embedded devices [189].
Based on the above rationale, cost efficiency is arguably the most evident functional
requirement around which energy-aware model designs can bring a breakthrough towards
improving the actionability of the overall data-based workflow. In addition, other aspects
can be made more actionable by using energy-aware model designs, such as usability [190].
Despite achieving unprecedented levels of predictive accuracy, a data-based workflow
may become useless should it deplete the battery of the system on which it is deployed
for operation. Therefore, energy efficiency should be a target of future research
efforts, especially when dealing with ITS applications running on battery-powered devices,
inspecting interesting paths rooted thereon such as the trade-off between performance and
energy consumption, or the adaptation of the model’s operation regime depending on the
remaining battery life, among others [191].
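The adaptation of the operation regime to the remaining battery can be sketched as a simple selection rule among model variants of different cost. The model names, accuracies and battery thresholds below are hypothetical values invented for illustration, as is the function itself.

```python
def select_model(battery_fraction, models):
    """Pick the most accurate model whose minimum battery requirement is met.

    `models` is a list of (name, accuracy, min_battery_fraction) tuples.
    """
    feasible = [m for m in models if battery_fraction >= m[2]]
    if not feasible:
        raise ValueError("no model can run at this battery level")
    # Among the feasible variants, prefer the one with the highest accuracy.
    return max(feasible, key=lambda m: m[1])[0]


# Hypothetical catalogue of model variants (values are illustrative only).
MODEL_VARIANTS = [
    ("full", 0.95, 0.5),
    ("lite", 0.90, 0.2),
    ("tiny", 0.80, 0.0),
]
```

With this catalogue, the device would run the "full" variant above 50% of battery, fall back to "lite" down to 20%, and use "tiny" otherwise.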

4.5. Fairness, Accountability, Transparency and Ethics


To end with, the prescription of actions based on the insights provided by a data-based
pipeline must be buttressed by a thorough understanding of the mechanisms behind its
provided decisions [192]. Extended information about the model must be presented to the
end user for several reasons:
• To gauge as many consequences of the actions as possible, identifying situations
where decision making based on the outputs of the data-based workflow gives rise to
socially unfair scenarios due to the propagation of inadvertently encoded bias to the
automated decisions of the model.
• To assure the user that the output of the model is reliable and invariant under the
same data stimuli, maintaining a record of the intermediate decisions made along the
pipeline, allowing for the post-mortem, potentially corrective analysis of bad decision
paths, and thereby maximizing the trust and certainty of the user when embracing
its output.
• To make the user understand why the developed model produces its prescriptive
output when fed with a set of data inputs, shedding light on which inputs correlate
more significantly with the prescribed actions, tracing back causal relations between
intermediate data inputs, and discriminating extreme cases where decisions can
change radically under slight modifications of the model inputs.
• To supervise the ethics of data-based workflows, identifying potentially illegal uses
of unlawful data given the prevailing legislation, guaranteeing the privacy and governance
of personal data by third-party data-based ITS applications and processes,
and certifying that the model’s output does not favor inequalities in
terms of gender, religion, race or any other aspect alike.
The above requirements have been lately compiled under the FATE (Fairness,
Accountability, Transparency and Ethics) concept, which refers to the design of
actionable data-based pipelines whose internal operations can be explained, accounted for and
critically examined with regard to the consequences of their eventual bias on privacy, fairness
and ethical issues [193–195]. This recent concern with the operation of machine learning
models stems from the proliferation of real cases where practical model deployments have
unveiled deficiencies of different kinds, from differential privacy breaches (data revealing
the identity of the persons to whom they belong) to unnoticed output bias that caused
racial discrimination issues [196]. For instance, data-based models for vehicular perception,
obstacle detection and avoidance must also be endowed with ethical and legal design factors
so that the overall decision is not driven solely by the data themselves. Another clear domain
where FATE can be crucial is modeling with crowd-sourced Big Data, where aspects like
privacy preservation [197] and bias avoidance [198] are arguably more critical [199,200].
The construction of the data-based modeling workflow must ensure (i) that protected
features remain as such once the workflow has been built, without any chance for
reverse engineering (via e.g., XAI techniques [90]) that could compromise the differential
privacy of data; and (ii) that learning algorithms along the workflow counteract hidden
bias in data that could eventually lead to discriminatory decisions (due to skewed samples,
tainted annotation, limited data sizes or imbalanced data). From our perspective, these
are among the most concerning challenges in the exploitation of Big Data in ITS, and the
main source of motivation for a number of recent studies in areas related to data-driven
transportation systems such as pedestrian detection [201], autonomous vehicles [202,203]
or urban computing [204]. Bias-related issues can be identified by a proper analysis of the
decisions made by the workflow, which in turn requires models to be accountable and
transparent enough to thoroughly characterize their sensitivity to bias, and how inputs and
outputs (decisions) correlate with regard to protected features. It is also worth noting
that several proposals have been made to quantify fairness in machine learning pipelines,
yielding useful metrics that account for the parity of models when processing groups of
inputs [205,206]. Unless these aspects are considered jointly with performance measures,
data-based ITS developments in years to come are at risk of remaining confined to
the academic playground [207].
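As an illustration of how the parity of a model across groups of inputs can be quantified, the following minimal Python sketch computes two commonly used group-fairness measures: the demographic parity gap (difference in positive decision rates between groups) and the equal opportunity gap (difference in true positive rates). These are not necessarily the metrics proposed in [205,206]; the function names, the binary decisions and the protected groups "A" and "B" are hypothetical and serve only as an example.

```python
from collections import defaultdict


def positive_rate_by_group(decisions, groups):
    """Fraction of positive (1) decisions issued for each protected group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_gap(decisions, groups):
    """Largest difference in positive decision rates between any two groups."""
    rates = positive_rate_by_group(decisions, groups)
    return max(rates.values()) - min(rates.values())


def equal_opportunity_gap(decisions, labels, groups):
    """Largest difference in true positive rates between any two groups.

    Only samples whose ground-truth label is positive (1) are considered.
    """
    kept = [(d, g) for d, y, g in zip(decisions, labels, groups) if y == 1]
    kept_decisions = [d for d, _ in kept]
    kept_groups = [g for _, g in kept]
    return demographic_parity_gap(kept_decisions, kept_groups)


# Hypothetical data: binary decisions of a model, ground-truth labels and
# a protected attribute with two groups, "A" and "B".
decisions = [1, 1, 0, 1, 0, 0, 1, 0]
labels = [1, 1, 0, 1, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

dp_gap = demographic_parity_gap(decisions, groups)
eo_gap = equal_opportunity_gap(decisions, labels, groups)
print(f"Demographic parity gap: {dp_gap:.2f}")
print(f"Equal opportunity gap: {eo_gap:.2f}")
```

A gap close to zero suggests that the model treats the groups similarly under the chosen criterion. Such figures can be reported alongside performance measures, although which criterion is appropriate depends on the application and the prevailing legislation.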

4.6. Other AI Research Areas Connected to Actionability


The above areas have been highlighted as the main propellers for model actionability
in ITS. However, it is worthwhile to mention other research areas from the AI
realm that can also help complete the chain from data to actions:
• Few-shot learning [208], which aims at overcoming the lack of reliably annotated data
and the practical difficulty of performing annotation in certain application scenarios.
For instance, accident prevention models cannot be enriched with positive samples
unless a fatality occurs and the data captured in place is fed into the model. Few-shot
learning and related subareas (zero-shot, one-shot) derive solutions that can automatically
learn from very small amounts of training data, incorporating mechanisms
(e.g., generative models, regularization techniques, guided simulation) to prevent the
overall model from overfitting [209]. With regard to actionability, this family of learning
techniques can help make data-based ITS models deployable in situations
lacking data supervision, especially when such data annotation cannot be guaranteed
to be achievable over time.
• Imbalanced and cost-sensitive learning [210,211], which link to the need for avoiding
model bias, not only to ensure the generalization of its output, but also to reduce the
likelihood of the workflow causing discriminatory issues such as the ones exemplified above.
These AI areas have been studied in the ITS community for years now [3].
However, we here emphasize the crucial role of these techniques beyond performance
boosting: techniques originally aimed at counteracting the effects of class imbalance
in the output of data-based models could also be leveraged to reflect legal impositions
that neither necessarily relate to the model’s performance nor can be inferred easily
from the attributes within the data themselves. The lack of compliance of the model with
fairness and ethics standards does not necessarily cause a performance degradation
observable at its output, nor can it be inferred easily from the available data.
• Hybrid models encompassing linguistic rules and data-based learning techniques,
capable of supporting the transition from the traditional way of working to the new
data-based modeling era in the management of ITS. We foresee that the community
will witness a renaissance of data mining methods incorporating techniques such as
fuzzy logic, not only to embed human knowledge into decision workflows, but also
to explain and describe the internal structure of learned models, as is currently
under investigation in many contributions under the XAI umbrella [212,213].
• New prescriptive data-based techniques such as Deep Reinforcement Learning [214]
and Algorithmic Game Theory [215] will also attract interest in the near future for their
close connection to actionable data science. The interaction of data-based workflows
with humans will require techniques capable of learning actions from experience,
and eventually orchestrating the interaction and negotiation among users when
their actions are governed by interrelated yet conflicting objectives. In fact, such
new prescriptive elements are progressively entering the literature in certain ITS
applications that target machine autonomy (e.g., autonomous vehicles [216,217] or
automated signaling [51]), but it is our vision that they will gain momentum in many
other ITS setups.
• Privacy-preserving Data Mining [218,219], which has garnered great interest in the
last year, with major breakthroughs reported at the intersection between machine learning,
cryptography, homomorphic encryption, secure enclaves and blockchains [220].
The use of personal data and the stringent pressure placed by governments and agencies
on differential privacy preservation have spurred a flurry of research to prevent
models from revealing sensitive data from their training instances [197,221]. Within
the ITS domain, it is possible to find many areas in which privacy preservation has
recently been a subject of intense research: from origin-destination flow estimation [222]
to route planners [223,224] or pattern mining [225], a glance at recent literature
reveals the momentum this topic has acquired lately. In all of these examples, data are
available as a result of the pervasiveness of sensing (especially in the case of VANETs)
and the capture of user data. While previous works explored how to use these data
properly with respect to privacy matters, the natural evolution of this research line
is to address how protected data are preserved
throughout the modeling workflow.
• Furthermore, the proven vulnerability of data-based models against adversarial attacks
has also motivated the community to lay the foundations of an entirely new
research area, Adversarial Machine Learning, committed to the design of models that are
robust against attacks crafted to confuse their outputs [226,227]. Interestingly, one
of the most widely exemplified scenarios in this research area relates to ITS: automated
traffic sign classification models were proven to be vulnerable to adversarial
attacks that place a simple, intelligently designed sticker on the traffic sign itself [228].
Likewise, the rationale behind Federated Learning (discussed in Section 4.2) also
spans beyond the efficient distribution of locally captured knowledge among models:
since no raw data instances are involved in the information transfer, the privacy of
local data is consequently preserved. In short: security also matters in actionable
data-based pipelines.
• Finally, the ever-growing scales of ITS scenarios demand more research invested
in scaling up learning algorithms in a computationally efficient manner [229]. Automated
traffic, smart cities and mobility as a service constitute ITS scenarios where a
plethora of information sources interact with each other. More effort must definitely
be invested in aggregation strategies for data-based models learned from different
interrelated data ecosystems, either in a distributed fashion (e.g., federated learning)
or in a centralized system (correspondingly, Map-Reduce implementations of data-based
models, cloud-based architectures, etc.). Computational aspects of large-scale
implementations should also be studied due to their implications in terms of
actionability, such as the latency of the system when prescribing decisions from data.
This latter aspect can be key for real-time ITS applications, for which the gap from
data to actions must be shortened to its minimum.
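To make the idea of aggregating models learned in a distributed fashion more concrete, the following minimal Python sketch averages the parameters of locally trained models, weighting each one by the number of samples it was trained on. This is in the spirit of federated averaging, where only model parameters (and no raw data instances) are exchanged. It is a simplified illustration under stated assumptions rather than a full federated learning protocol: the function name, the three sites and their parameter vectors are hypothetical.

```python
def federated_average(local_weights, sample_counts):
    """Sample-weighted average of per-site parameter vectors.

    local_weights: list of equally sized lists of floats, one per site.
    sample_counts: number of training samples used at each site.
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("One sample count is required per set of weights.")
    dim = len(local_weights[0])
    if any(len(weights) != dim for weights in local_weights):
        raise ValueError("All parameter vectors must have the same length.")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("The total number of samples must be positive.")
    return [
        sum(weights[i] * n for weights, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


# Hypothetical example: three roadside units train the same two-parameter
# model on local traffic data and share only their learned parameters.
site_weights = [[0.2, 1.0], [0.4, 2.0], [0.6, 3.0]]
site_samples = [100, 300, 600]

global_weights = federated_average(site_weights, site_samples)
print(global_weights)
```

Sites with more training samples contribute proportionally more to the aggregated model. In a practical deployment this step would be repeated over several communication rounds, which is where latency and computational efficiency become relevant.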

5. Concluding Notes and Outlook


This work has built upon the overabundance of contributions within the ITS community
dealing with performance-based comparisons among data-based models. Our claim is
that, as in any other domain of application, data-based modeling should bridge the gap
between data and actions, providing value to the ITS application at hand beyond
superior model performance statistics. It is our firm belief that the research community
should embrace actionability as the primary design motto, with negligible performance
improvements being left behind in favor of relevant aspects such as adaptability, usability,
resiliency, scalability or efficiency.
To provide a solid rationale for our postulations, we have first presented a reference
model for actionable data-based workflows, placing emphasis on the different phases
that should be undertaken to translate data into actions of added value for the decision
maker. Adaptation has been highlighted as a necessary albeit often neglected processing
step in data-based modeling, which allows models to be effective when deployed on
dynamic ITS environments with time-varying data sources. Next, our study has listed
the main functional requirements that models along the reference model should meet to
guarantee their actionability, followed by an overview of incipient research areas in Data
Science and Artificial Intelligence that should progressively enter the ITS arena. Indeed,
advances in XAI, Online Learning, Gray-box Modeling and Transfer Learning are currently
investigated mostly from an application-agnostic perspective. Their undoubted connection
to actionability makes them the core of a promising future for data-based modeling in ITS
systems, processes and applications.
Other research areas related to Artificial Intelligence beyond those covered in our
reflections will surely spawn further opportunities for actionability in ITS, provided that
they fully embrace their ultimate goal: to effectively support decision making. Among
them, the use of Automated Machine Learning (AutoML [230]) for tuning data-based
models should not only optimize performance-based metrics (e.g., finding a model that
attains maximum accuracy for image segmentation in vehicular perception cameras), but
also comply with other objectives and constraints that closely link to actionability (e.g.,
robustness against adversarial attacks, or a lower epistemic uncertainty of the model induced
in its output). Unless all such actionability constraints are regarded as design objectives
and accounted for as such in the automated discovery of new data-based pipelines, any
incursion of AutoML in ITS will be of no practical value. To this end, we believe
that the confluence of multiple functional and non-functional requirements in this automated
design process will pave the way towards the massive adoption of multi-objective
optimization algorithms as a framework to infer and analyze all trade-offs existing
among the design objectives.
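The notion of trade-offs among design objectives can be illustrated with a minimal Python sketch that extracts the Pareto front (the set of non-dominated candidates) from a handful of candidate pipelines. The pipelines, their names and their scores are hypothetical, and all objectives are assumed to be minimized; an actual multi-objective AutoML process would search over a far larger space with a dedicated optimizer.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one. All objectives are assumed to be minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates):
    """Names of the candidates that are not dominated by any other candidate."""
    front = []
    for name, objectives in candidates.items():
        dominated = any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical pipelines scored as
# (error rate, error rate under adversarial attack, latency in ms).
pipelines = {
    "A": (0.05, 0.40, 120.0),
    "B": (0.07, 0.15, 150.0),
    "C": (0.08, 0.45, 130.0),
    "D": (0.10, 0.20, 30.0),
}

front = pareto_front(pipelines)
print(front)
```

In this example, pipeline C is discarded because pipeline A is better on all three objectives, whereas A, B and D each represent a different compromise between accuracy, robustness and latency. The final choice among them is left to the decision maker.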
Data-based modeling has brought a deep transformation to ITS. A vast number of
research works in the field are produced by data-based modeling specialists attracted by
the profusion of available data, and with limited knowledge of transportation. Data-based
models are getting progressively more complex, increasing the gap between research and
practice. This situation calls for a change of paradigm, to one in which the actionability
requirements of models are sought by researchers, and practitioners are aware of the
technologies available to provide them. Model actionability is a great whole that can act as an
incentive to perform smaller steps towards its realization. It is probably unthinkable to
develop, in a research environment, a data-based model that meets all proposed requirements.
However, addressing some of the postulated requirements while developing a
competitive data-based ITS model will bring it closer to actionability. There is, therefore,
a long road to be travelled in ITS model actionability, with interesting avenues around
the thorough understanding of models, and the adoption of emerging AI technologies
to endow data-based workflows with the requirements needed to make them actionable
in practice. As shown in our study, there is a germinal interest in these research topics.
Nevertheless, we foresee vast opportunities for future work when model actionability is
set as a design priority.
On a closing note, we advocate for a new dawn of Data Science in the ITS domain,
where advances in modeling performance emerge together with accounts and
reports of how such models have helped decision making in practical scenarios. Data
mining has limited merit without actions prescribed from its outputs, always in compliance
and close match with the specificities of its context.

Author Contributions: I.L.: Conceptualization, Methodology, Investigation, Writing—Original Draft,
Writing—Review & Editing. J.J.S.-M.: Conceptualization, Methodology, Investigation, Validation,
Writing—Original Draft, Funding acquisition. E.I.V.: Conceptualization, Methodology, Writing—Original
Draft, Funding acquisition. J.D.S.: Conceptualization, Methodology, Investigation, Supervision, Writing—
Review & Editing, Funding acquisition. All authors have read and agreed to the published version of
the manuscript.
Funding: This work was supported in part by the Basque Government
through the EMAITEK program (3KIA, ref. KK-2020/00049). It has also received funding support
from the Consolidated Research Group MATHMODE (IT1294-19) granted by the Department of
Education of the Basque Government.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data sharing not applicable.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Zhu, L.; Yu, F.R.; Wang, Y.; Ning, B.; Tang, T. Big data analytics in intelligent transportation systems: A survey. IEEE Trans. Intell.
Transp. Syst. 2018, 20, 383–398. [CrossRef]
2. Albino, V.; Berardi, U.; Dangelico, R.M. Smart cities: Definitions, dimensions, performance, and initiatives. J. Urban Technol. 2015,
22, 3–21. [CrossRef]
3. Zhang, J.; Wang, F.Y.; Wang, K.; Lin, W.H.; Xu, X.; Chen, C. Data-driven intelligent transportation systems: A survey. IEEE Trans.
Intell. Transp. Syst. 2011, 12, 1624–1639. [CrossRef]
4. Karlaftis, M.G.; Vlahogianni, E.I. Statistical methods versus neural networks in transportation research: Differences, similarities
and some insights. Transp. Res. Part C Emerg. Technol. 2011, 19, 387–399. [CrossRef]
5. Del Ser, J.; Osaba, E.; Sanchez-Medina, J.J.; Fister, I. Bioinspired Computational Intelligence and Transportation Systems: A Long
Road Ahead. IEEE Trans. Intell. Transp. Syst. 2019, 21, 466–495. [CrossRef]
6. Eiben, A.E.; Smith, J.E. Introduction to Evolutionary Computing; Springer: Berlin/Heidelberg, Germany, 2003; Volume 53.
7. Ditzler, G.; Roveri, M.; Alippi, C.; Polikar, R. Learning in nonstationary environments: A survey. IEEE Comput. Intell. Mag. 2015,
10, 12–25. [CrossRef]
8. Said, H.; Nicoletti, T.; Perez-Hernandez, P. Utilizing telematics data to support effective equipment fleet-management decisions:
Utilization rate and hazard functions. J. Comput. Civ. Eng. 2016, 30, 04014122. [CrossRef]
9. Urbahs, A.; Žavtkēvičs, V. Remotely Piloted Aircraft route optimization when performing oil pollution monitoring of the sea
aquatorium. Aviation 2017, 21, 70–74. [CrossRef]
10. Khaksar, H.; Sheikholeslami, A. Airline delay prediction by machine learning algorithms. Sci. Iran. 2019, 26, 2689–2702.
[CrossRef]
11. Mott, J.H.; Bullock, D.M.; McNamara, M.L. Estimating Aircraft Operations at Airports Using Transponder Data. US Patent
Application No. 15/248,581, 18 May 2017.
12. Herring, R.; Hofleitner, A.; Abbeel, P.; Bayen, A. Estimating arterial traffic conditions using sparse probe data. In Proceedings
of the 13th International IEEE Conference on Intelligent Transportation Systems, Madeira, Portugal, 19–22 September 2010;
pp. 929–936.
13. Kujala, R.; Aledavood, T.; Saramäki, J. Estimation and monitoring of city-to-city travel times using call detail records. EPJ Data
Sci. 2016, 5, 1–16. [CrossRef]
14. Sun, Z.; Hao, P.; Ban, X.J.; Yang, D. Trajectory-based vehicle energy/emissions estimation for signalized arterials using mobile
sensing data. Transp. Res. Part D Transp. Environ. 2015, 34, 27–40. [CrossRef]
15. Rodrigues, J.G.; Aguiar, A.; Vieira, F.; Barros, J.; Cunha, J.P.S. A mobile sensing architecture for massive urban scanning.
In Proceedings of the 2011 14th International IEEE Conference on Intelligent Transportation Systems (ITSC), Washington, DC,
USA, 5–7 October 2011; pp. 1132–1137.
16. Laña, I.; Del Ser, J.; Velez, M.; Vlahogianni, E.I. Road Traffic Forecasting: Recent Advances and New Challenges. IEEE Intell.
Transp. Syst. Mag. 2018, 10, 93–109. [CrossRef]
17. García, S.; Luengo, J.; Herrera, F. Data Preprocessing in Data Mining; Springer: Berlin/Heidelberg, Germany, 2015.
18. Lopes, J.; Bento, J.; Huang, E.; Antoniou, C.; Ben-Akiva, M. Traffic and mobility data collection for real-time applications.
In Proceedings of the 13th International IEEE Conference on Intelligent Transportation Systems, Madeira, Portugal, 19–22 September
2010; pp. 216–223.
19. Vlahogianni, E.I.; Golias, J.C.; Karlaftis, M.G. Short-term traffic forecasting: Overview of objectives and methods. Transp. Rev.
2004, 24, 533–557. [CrossRef]
20. Chen, H.; Grant-Muller, S.; Mussone, L.; Montgomery, F. A study of hybrid neural network approaches and the effects of missing
data on traffic forecasting. Neural Comput. Appl. 2001, 10, 277–286. [CrossRef]
21. Qu, L.; Li, L.; Zhang, Y.; Hu, J. PPCA-based missing data imputation for traffic flow volume: A systematical approach. IEEE Trans.
Intell. Transp. Syst. 2009, 10, 512–522.
22. Tan, H.; Feng, G.; Feng, J.; Wang, W.; Zhang, Y.J.; Li, F. A tensor-based method for missing traffic data completion. Transp. Res.
Part C Emerg. Technol. 2013, 28, 15–27. [CrossRef]
23. Li, Y.; Li, Z.; Li, L. Missing traffic data: Comparison of imputation methods. IET Intell. Transp. Syst. 2014, 8, 51–57. [CrossRef]
24. Ran, B.; Tan, H.; Wu, Y.; Jin, P.J. Tensor based missing traffic data completion with spatial–temporal correlation. Phys. A Stat.
Mech. Appl. 2016, 446, 54–63. [CrossRef]
25. Laña, I.; Olabarrieta, I.I.; Vélez, M.; Del Ser, J. On the imputation of missing data for road traffic forecasting: New insights and
novel techniques. Transp. Res. Part C Emerg. Technol. 2018, 90, 18–33. [CrossRef]
26. Krempl, G.; Žliobaite, I.; Brzeziński, D.; Hüllermeier, E.; Last, M.; Lemaire, V.; Noack, T.; Shaker, A.; Sievi, S.; Spiliopoulou, M.; et al.
Open challenges for data stream mining research. ACM SIGKDD Explor. Newsl. 2014, 16, 1–10. [CrossRef]
27. Etemad, M.; Soares Júnior, A.; Matwin, S. Predicting transportation modes of GPS trajectories using feature engineering and
noise removal. In Proceedings of the Advances in Artificial Intelligence: 31st Canadian Conference on Artificial Intelligence,
Canadian AI 2018, Toronto, ON, Canada, 8–11 May 2018; pp. 259–264.
28. Zheng, C.; Chen, S.; Wang, W.; Lu, J. Using principal component analysis to solve a class imbalance problem in traffic incident
detection. Math. Probl. Eng. 2013, 2013, 524861. [CrossRef]
29. Smith, B.L.; Babiceanu, S. Investigation of extraction, transformation, and loading techniques for traffic data warehouses. Transp.
Res. Rec. 2004, 1879, 9–16. [CrossRef]
30. El Faouzi, N.E.; Leung, H.; Kurian, A. Data fusion in intelligent transportation systems: Progress and challenges—A survey.
Inf. Fusion 2011, 12, 4–10. [CrossRef]
31. Choi, K.; Chung, Y. A data fusion algorithm for estimating link travel time. ITS J. 2002, 7, 235–260. [CrossRef]
32. Chang, B.R.; Tsai, H.F.; Young, C.P. Intelligent data fusion system for predicting vehicle collision warning using vision/GPS
sensing. Expert Syst. Appl. 2010, 37, 2439–2450. [CrossRef]
33. Han, L.; Wu, K. Radar and radio data fusion platform for future intelligent transportation system. In Proceedings of the 7th
European Radar Conference, Paris, France, 30 September–1 October 2010; pp. 65–68.
34. Treiber, M.; Kesting, A.; Wilson, R.E. Reconstructing the traffic state by fusion of heterogeneous data. Comput. Aided Civ.
Infrastruct. Eng. 2011, 26, 408–419. [CrossRef]
35. Vlahogianni, E.I. Enhancing predictions in signalized arterials with information on short-term traffic flow dynamics. J. Intell.
Transp. Syst. 2009, 13, 73–84. [CrossRef]
36. Laña, I.; Lobo, J.L.; Capecci, E.; Del Ser, J.; Kasabov, N. Adaptive long-term traffic state estimation with evolving spiking neural
networks. Transp. Res. Part C Emerg. Technol. 2019, 101, 126–144. [CrossRef]
37. Liu, T.; Hu, J.; Pei, X. Mining the Temporal-Spatial Patterns of Urban Traffic Demands Based on Taxi Mobility Data. In Proceedings
of the 19th COTA International Conference of Transportation Professionals, Nanjing, China, 6–8 July 2019; pp. 2716–2728.
38. Moretti, F.; Pizzuti, S.; Panzieri, S.; Annunziato, M. Urban traffic flow forecasting through statistical and neural network bagging
ensemble hybrid modeling. Neurocomputing 2015, 167, 3–7. [CrossRef]
39. Cong, Y.; Wang, J.; Li, X. Traffic flow forecasting by a least squares support vector machine with a fruit fly optimization algorithm.
Procedia Eng. 2016, 137, 59–68. [CrossRef]
40. Kim, Y.J.; Hong, J.S. Urban traffic flow prediction system using a multifactor pattern recognition model. IEEE Trans. Intell. Transp.
Syst. 2015, 16, 2744–2755.
41. Fusco, G.; Colombaroni, C.; Comelli, L.; Isaenko, N. Short-term traffic predictions on large urban traffic networks: Applications
of network-based machine learning models and dynamic traffic assignment models. In Proceedings of the 2015 International
Conference on Models and Technologies for Intelligent Transportation Systems (MT-ITS), Budapest, Hungary, 3–5 June 2015;
pp. 93–101.
42. Montanino, M.; Punzo, V. Trajectory data reconstruction and simulation-based validation against macroscopic traffic patterns.
Transp. Res. Part B Methodol. 2015, 80, 82–106. [CrossRef]
43. Chaulwar, A.; Botsch, M.; Utschick, W. A hybrid machine learning approach for planning safe trajectories in complex traffic-
scenarios. In Proceedings of the 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA),
Anaheim, CA, USA, 18–20 December 2016; pp. 540–546.
44. Vlahogianni, E.I. Optimization of traffic forecasting: Intelligent surrogate modeling. Transp. Res. Part C Emerg. Technol. 2015,
55, 14–23. [CrossRef]
45. Teodorović, D. Swarm intelligence systems for transportation engineering: Principles and applications. Transp. Res. Part C Emerg.
Technol. 2008, 16, 651–667. [CrossRef]
46. Kumar, S.N.; Panneerselvam, R. A survey on the vehicle routing problem and its variants. Intell. Inf. Manag. 2012, 4, 66.
[CrossRef]
47. Osaba, E.; Yang, X.S.; Diaz, F.; Lopez-Garcia, P.; Carballedo, R. An improved discrete bat algorithm for symmetric and asymmetric
traveling salesman problems. Eng. Appl. Artif. Intell. 2016, 48, 59–71. [CrossRef]
48. Mendes-Moreira, J.; Moreira-Matias, L.; Gama, J.; de Sousa, J.F. Validating the coverage of bus schedules: A machine learning
approach. Inf. Sci. 2015, 293, 299–313. [CrossRef]
49. Szeto, W.; Jiang, Y.; Wang, D.; Sumalee, A. A sustainable road network design problem with land use transportation interaction
over time. Netw. Spat. Econ. 2015, 15, 791–822. [CrossRef]
50. Van Winden, K.; Biljecki, F.; Van der Spek, S. Automatic update of road attributes by mining GPS tracks. Trans. GIS 2016,
20, 664–683. [CrossRef]
51. Mannion, P.; Duggan, J.; Howley, E. An experimental review of reinforcement learning algorithms for adaptive traffic signal
control. In Autonomic Road Transport Support Systems; Springer: Berlin/Heidelberg, Germany, 2016; pp. 47–66.
52. Osaba, E.; Yang, X.S.; Diaz, F.; Onieva, E.; Masegosa, A.D.; Perallos, A. A discrete firefly algorithm to solve a rich vehicle routing
problem modelling a newspaper distribution system with recycling policy. Soft Comput. 2017, 21, 5295–5308. [CrossRef]
53. Imprialou, M.I.M.; Orfanou, F.P.; Vlahogianni, E.I.; Karlaftis, M.G. Methods for defining spatiotemporal influence areas and
secondary incident detection in freeways. J. Transp. Eng. 2013, 140, 70–80. [CrossRef]
54. Yu, C.; Wang, X.; Xu, X.; Zhang, M.; Ge, H.; Ren, J.; Sun, L.; Chen, B.; Tan, G. Distributed Multiagent Coordinated Learning for
Autonomous Driving in Highways Based on Dynamic Coordination Graphs. IEEE Trans. Intell. Transp. Syst. 2019, 21, 735–748.
[CrossRef]
55. Lécué, F.; Tallevi-Diotallevi, S.; Hayes, J.; Tucker, R.; Bicer, V.; Sbodio, M.L.; Tommasi, P. Star-city: Semantic traffic analytics and
reasoning for city. In Proceedings of the 19th International Conference on Intelligent User Interfaces, Haifa, Israel, 24–27 February
2014; pp. 179–188.
56. Gindele, T.; Brechtel, S.; Dillmann, R. Learning driver behavior models from traffic observations for decision making and
planning. IEEE Intell. Transp. Syst. Mag. 2015, 7, 69–79. [CrossRef]
57. Kammoun, H.M.; Kallel, I.; Casillas, J.; Abraham, A.; Alimi, A.M. Adapt-Traf: An adaptive multiagent road traffic management
system based on hybrid ant-hierarchical fuzzy model. Transp. Res. Part C Emerg. Technol. 2014, 42, 147–167. [CrossRef]
58. Mesbah, A. Stochastic model predictive control: An overview and perspectives for future research. IEEE Control. Syst. Mag. 2016,
36, 30–44.
59. Hrovat, D.; Di Cairano, S.; Tseng, H.E.; Kolmanovsky, I.V. The development of model predictive control in automotive industry:
A survey. In Proceedings of the 2012 IEEE International Conference on Control Applications, Dubrovnik, Croatia, 3–5 October
2012; pp. 295–302.
60. Buchanan, C. Traffic in Towns: A Study of the Long Term Problems of Traffic in Urban Areas; Routledge: Abingdon, UK, 2015.
61. Pan, B.; Zheng, Y.; Wilkie, D.; Shahabi, C. Crowd sensing of traffic anomalies based on human mobility and social media.
In Proceedings of the 21st ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems,
Orlando, FL, USA, 5–8 November 2013; pp. 344–353.
62. Davison, L.J.; Knowles, R.D. Bus quality partnerships, modal shift and traffic decongestion. J. Transp. Geogr. 2006, 14, 177–194.
[CrossRef]
63. Nielsen, J. Usability Engineering; Elsevier: Amsterdam, The Netherlands, 1994.
64. Nielsen, J. Usability inspection methods. In Proceedings of the Conference Companion on Human Factors in Computing Systems,
Boston, MA, USA, 24–28 April 1994; pp. 413–414.
65. Brooke, J. SUS-A quick and dirty usability scale. Usability Eval. Ind. 1996, 189, 4–7.
66. Nielsen, J. 10 Usability Heuristics for User Interface Design. 1995. Available online:
https://www.nngroup.com/articles/ten-usability-heuristics/ (accessed on 1 December 2018).
67. Nielsen, J. Usability 101: Introduction to Usability. 2003. Available online:
http://www.nngroup.com/articles/usability-101-introduction-to-usability/ (accessed on 1 December 2018).
68. Noy, Y.I. Human factors in modern traffic systems. Ergonomics 1997, 40, 1016–1024. [CrossRef]
69. Green, P. Estimating compliance with the 15-second rule for driver-interface usability and safety. In Proceedings of the Human
Factors and Ergonomics Society Annual Meeting, Houston, TX, USA, 27 September–1 October 1999; Volume 43, pp. 987–991.
70. Green, P. Navigation System Data Entry: Estimation of Task Times. 1999. Available online:
https://deepblue.lib.umich.edu/bitstream/handle/2027.42/1288/94020.0001.001.pdf?sequence=2 (accessed on 2 September 2018).
71. Burns, P.; Harbluk, J.; Foley, J.P.; Angell, L. The importance of task duration and related measures in assessing the distraction
potential of in-vehicle tasks. In Proceedings of the 2nd International Conference on Automotive User Interfaces and Interactive
Vehicular Applications, Pittsburgh, PA, USA, 11–12 November 2010; pp. 12–19.
72. Burnett, G. ‘Turn right at the traffic lights’: The requirement for landmarks in vehicle navigation systems. J. Navig. 2000,
53, 499–510. [CrossRef]
73. Dos Santos, C.; Botura, G. Proposal of Ergonomic Intervention in Horizontal Traffic Signaling. In Proceedings of the International
Conference on Applied Human Factors and Ergonomics, Los Angeles, CA, USA, 17–21 July 2017; pp. 1121–1130.
74. Avelar, S.; Hurni, L. On the design of schematic transport maps. Cartogr. Int. J. Geogr. Inf. Geovis. 2006, 41, 217–228. [CrossRef]
75. Roberts, M.J.; Newton, E.J.; Canals, M. Radi(c)al departures: Comparing conventional octolinear versus concentric circles
schematic maps for the Berlin U-Bahn/S-Bahn networks using objective and subjective measures of effectiveness. Inf. Des. J.
2016, 22, 92–115.
76. Lyons, G. From Advanced Towards Effective Traveller Information Systems. In Travel Behaviour Research The Leading Edge;
Hensher, D., Ed.; International Association for Travel Behaviour Research: Pergamon, Turkey, 2001; Volume 47, pp. 813–826.
77. Barfield, W.; Dingus, T.A. Human Factors in Intelligent Transportation Systems; Psychology Press: Hove, UK, 2014.
78. Vlahogianni, E.I.; Karlaftis, M.G.; Golias, J.C. Short-term traffic forecasting: Where we are and where we’re going. Transp. Res.
Part C Emerg. Technol. 2014, 43, 3–19. [CrossRef]
79. Horan, T.A.; Abhichandani, T.; Rayalu, R. Assessing user satisfaction of e-government services: Development and testing
of quality-in-use satisfaction with advanced traveler information systems (ATIS). In Proceedings of the 39th Annual Hawaii
International Conference on System Sciences (HICSS’06), Kauai, Hawaii, 4–7 January 2006; Volume 4, p. 83b.
80. Ross, T.; Burnett, G. Evaluating the human–machine interface to vehicle navigation systems as an example of ubiquitous
computing. Int. J. Hum. Comput. Stud. 2001, 55, 661–674. [CrossRef]
81. Fischer, G.; Sullivan, J., Jr. Human-centered public transportation systems for persons with cognitive disabilities. In Proceedings
of the Participatory Design Conference, Malmo, Sweden, 23–25 June 2002; pp. 194–198.
82. Dingus, T.A.; Hulse, M.C.; Jahns, S.K.; Alves-Foss, J.; Confer, S.; Rice, A.; Roberts, I.; Hanowski, R.J.; Sorenson, D. Development of
Human Factors Guidelines for Advanced Traveler Information Systems and Commercial Vehicle Operations: Literature Review.
1996. Available online: https://www.fhwa.dot.gov/publications/research/safety/95153/index.cfm (accessed on 3 August 2018).
83. Beul-Leusmann, S.; Samsel, C.; Wiederhold, M.; Krempels, K.H.; Jakobs, E.M.; Ziefle, M. Usability evaluation of mobile passenger
information systems. In Proceedings of the International Conference of Design, User Experience, and Usability, Heraklion,
Greece, 22–27 June 2014; pp. 217–228.
84. Mazloumi, E.; Rose, G.; Currie, G.; Moridpour, S. Prediction intervals to account for uncertainties in neural network predictions:
Methodology and application in bus travel time prediction. Eng. Appl. Artif. Intell. 2011, 24, 534–542. [CrossRef]
85. Van Hinsbergen, C.I.; Van Lint, J.; Van Zuylen, H. Bayesian committee of neural networks to predict travel times with confidence
intervals. Transp. Res. Part C Emerg. Technol. 2009, 17, 498–509. [CrossRef]
86. Liu, Y.; Liu, Z.; Li, X.; Huang, W.; Wei, Y.; Cao, J.; Guo, J. Dynamic traffic demand uncertainty prediction using radio-frequency
identification data and link volume data. IET Intell. Transp. Syst. 2019, 13, 1309–1317. [CrossRef]
87. Tsekeris, T.; Stathopoulos, A. Short-term prediction of urban traffic variability: Stochastic volatility modeling approach. J. Transp.
Eng. 2009, 136, 606–613. [CrossRef]
88. Khosravi, A.; Mazloumi, E.; Nahavandi, S.; Creighton, D.; Van Lint, J. Prediction intervals to account for uncertainties in travel
time prediction. IEEE Trans. Intell. Transp. Syst. 2011, 12, 537–547. [CrossRef]
89. Gunning, D. Explainable artificial intelligence (XAI). In Defense Advanced Research Projects Agency (DARPA), nd Web; 2017; Available
online: https://www.cc.gatech.edu/~alanwags/DLAI2016/(Gunning)%20IJCAI-16%20DLAI%20WS.pdf (accessed on 4 April 2018).
90. Barredo Arrieta, A.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.;
Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward
responsible AI. Inf. Fusion 2020, 58, 82–115. [CrossRef]
91. Samek, W.; Wiegand, T.; Müller, K.R. Explainable artificial intelligence: Understanding, visualizing and interpreting deep
learning models. arXiv 2017, arXiv:1708.08296.
92. Ras, G.; van Gerven, M.; Haselager, P. Explanation methods in deep learning: Users, values, concerns and challenges. In Explainable
and Interpretable Models in Computer Vision and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2018; pp. 19–36.
93. Vlahogianni, E.I.; Karlaftis, M.G.; Orfanou, F.P. Modeling the effects of weather and traffic on the risk of secondary incidents.
J. Intell. Transp. Syst. 2012, 16, 109–117. [CrossRef]
94. Thiagarajan, A.; Ravindranath, L.; LaCurts, K.; Madden, S.; Balakrishnan, H.; Toledo, S.; Eriksson, J. VTrack: Accurate, energy-aware
road traffic delay estimation using mobile phones. In Proceedings of the 7th ACM Conference on Embedded Networked
Sensor Systems, Berkeley, CA, USA, 4–6 November 2009; pp. 85–98.
95. Thiagarajan, A. Probabilistic Models for Mobile Phone Trajectory Estimation. Ph.D. Thesis, Massachusetts Institute of Technology,
Cambridge, MA, USA, 2011.
96. Geisler, S.; Quix, C.; Schiffer, S.; Jarke, M. An evaluation framework for traffic information systems based on data streams. Transp.
Res. Part C Emerg. Technol. 2012, 23, 29–55. [CrossRef]
97. Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A. A survey on concept drift adaptation. ACM Comput. Surv.
(CSUR) 2014, 46, 44. [CrossRef]
98. Žliobaitė, I.; Pechenizkiy, M.; Gama, J. An overview of concept drift applications. In Big Data Analysis: New Algorithms for a New
Society; Springer: Berlin/Heidelberg, Germany, 2016; pp. 91–114.
99. Delany, S.J.; Cunningham, P.; Tsymbal, A.; Coyle, L. A case-based technique for tracking concept drift in spam filtering.
In Applications and Innovations in Intelligent Systems XII; Springer: Berlin/Heidelberg, Germany, 2005; pp. 3–16.
100. Méndez, J.R.; Fdez-Riverola, F.; Iglesias, E.L.; Díaz, F.; Corchado, J.M. Tracking concept drift at feature selection stage in
spamhunting: An anti-spam instance-based reasoning system. In Proceedings of the European Conference on Case-Based
Reasoning, Fethiye, Turkey, 4–7 September 2006; pp. 504–518.
101. Stiglic, G.; Kokol, P. Interpretability of sudden concept drift in medical informatics domain. In Proceedings of the 2011 IEEE 11th
International Conference on Data Mining Workshops (ICDMW), Vancouver, BC, Canada, 11–14 December 2011; pp. 609–613.
102. Moreira-Matias, L.; Mendes-Moreira, J.; Gama, J.; Ferreira, M. On Improving Operational Planning and Control in Public
Transportation Networks Using Streaming Data: A Machine Learning Approach. Ph.D. Thesis, Porto University, Porto, Portugal, 2014.
103. Laña, I.; Del Ser, J.; Padró, A.; Vélez, M.; Casanova-Mateo, C. The role of local urban traffic and meteorological conditions in air
pollution: A data-based case study in Madrid, Spain. Atmos. Environ. 2016, 145, 424–438. [CrossRef]
104. Moreira-Matias, L.; Alesiani, F. Drift3flow: Freeway-incident prediction using real-time learning. In Proceedings of the 2015
IEEE 18th International Conference on Intelligent Transportation Systems (ITSC), Gran Canaria, Spain, 15–18 September 2015;
pp. 566–571.
105. Osekowska, E.; Johnson, H.; Carlsson, B. Maritime vessel traffic modeling in the context of concept drift. Transp. Res. Procedia
2017, 25, 1457–1476. [CrossRef]
106. Wibisono, A.; Jatmiko, W.; Wisesa, H.A.; Hardjono, B.; Mursanto, P. Traffic big data prediction and visualization using fast
incremental model trees-drift detection (FIMT-DD). Knowl. Based Syst. 2016, 93, 33–46. [CrossRef]
107. Wu, T.; Xie, K.; Xinpin, D.; Song, G. A online boosting approach for traffic flow forecasting under abnormal conditions.
In Proceedings of the 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), Chongqing, China,
29–31 May 2012; pp. 2555–2559.
108. Procopio, M.J.; Mulligan, J.; Grudic, G. Learning terrain segmentation with classifier ensembles for autonomous robot navigation
in unstructured environments. J. Field Robot. 2009, 26, 145–175. [CrossRef]
109. Zhang, Y.; Ye, Z. Short-term traffic flow forecasting using fuzzy logic system methods. J. Intell. Transp. Syst. 2008, 12, 102–112.
[CrossRef]
110. Isaacson, D.; Robinson, J.; Swenson, H.; Denery, D. A concept for robust, high density terminal air traffic operations. In Proceedings
of the 10th AIAA Aviation Technology, Integration, and Operations (ATIO) Conference, Fort Worth, TX, USA, 13–15 September
2010; p. 9292.
111. Chen, J.; Chen, L.; Sun, D. Air traffic flow management under uncertainty using chance-constrained optimization. Transp. Res.
Part B Methodol. 2017, 102, 124–141. [CrossRef]
112. Wechsler, S.P.; Ban, H.; Li, L. The Pervasive Challenge of Error and Uncertainty in Geospatial Data. In Geospatial Challenges in the
21st Century; Springer: Berlin/Heidelberg, Germany, 2019; pp. 315–332.
113. Adar, E.; Re, C. Managing uncertainty in social networks. IEEE Data Eng. Bull. 2007, 30, 15–22.
114. De Lara, M. A mathematical framework for resilience: Dynamics, strategies, shocks and acceptable paths. arXiv 2017,
arXiv:1709.01389.
115. Colpaert, P.; Ballieu, S.; Verborgh, R.; Mannens, E. The Impact of an Extra Feature on the Scalability of Linked Connections.
In Proceedings of the 7th International Workshop on Consuming Linked Data co-located with 15th International Semantic Web
Conference, COLD@ISWC 2016, Kobe, Japan, 18 October 2016; p. 10.
116. Basu, A.; Vanajakshi, L. Genetic Algorithm Based Dynamic Route Planner for Public Transport 2. Transport 2016, 2, 3.
117. Schmitt, F.; Schulte, A. Experimental evaluation of a scalable mixed-initiative planning associate for future military helicopter
missions. In Proceedings of the International Conference on Engineering Psychology and Cognitive Ergonomics, Las Vegas, NV,
USA, 15–20 July 2018; pp. 649–663.
118. Zhang, W.; Zhu, B.; Zhang, L.; Yuan, J.; You, I. Exploring urban dynamics based on pervasive sensing: Correlation analysis of
traffic density and air quality. In Proceedings of the 2012 Sixth International Conference on Innovative Mobile and Internet
Services in Ubiquitous Computing, Palermo, Italy, 4–6 July 2012; pp. 9–16.
119. Zhang, Z.; Ni, M.; He, Q.; Gao, J.; Gou, J.; Li, X. Exploratory study on correlation between Twitter concentration and traffic surges.
Transp. Res. Rec. 2016, 2553, 90–98. [CrossRef]
120. Zhang, S.; Wu, G.; Costeira, J.P.; Moura, J.M. Understanding traffic density from large-scale web camera data. In Proceedings of
the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5898–5907.
121. Kerner, B.S. Three-phase traffic theory and highway capacity. Phys. A Stat. Mech. Its Appl. 2004, 333, 379–440. [CrossRef]
122. Kerner, B.S. The physics of traffic. Phys. World 1999, 12, 25. [CrossRef]
123. Karpatne, A.; Atluri, G.; Faghmous, J.H.; Steinbach, M.; Banerjee, A.; Ganguly, A.; Shekhar, S.; Samatova, N.; Kumar, V. Theory-guided
data science: A new paradigm for scientific discovery from data. IEEE Trans. Knowl. Data Eng. 2017, 29, 2318–2331.
[CrossRef]
124. Vlahogianni, E.I.; Karlaftis, M.G.; Golias, J.C. Spatio-temporal short-term urban traffic volume forecasting using genetically
optimized modular networks. Comput. Aided Civ. Infrastruct. Eng. 2007, 22, 317–325. [CrossRef]
125. Ramezani, M.; Geroliminis, N. On the estimation of arterial route travel time distribution with Markov chains. Transp. Res. Part B
Methodol. 2012, 46, 1576–1590. [CrossRef]
126. Chen, B.; Cheng, H.H. A review of the applications of agent technology in traffic and transportation systems. IEEE Trans. Intell.
Transp. Syst. 2010, 11, 485–497. [CrossRef]
127. Shahrbabaki, M.R.; Safavi, A.A.; Papageorgiou, M.; Papamichail, I. A data fusion approach for real-time traffic state estimation in
urban signalized links. Transp. Res. Part C Emerg. Technol. 2018, 92, 525–548. [CrossRef]
128. Mintsis, E.; Vlahogianni, E.I.; Mitsakis, E.; Ozkul, S. Evaluation of a cooperative speed advice service implemented along an
urban arterial corridor. In Proceedings of the 2017 5th IEEE International Conference on Models and Technologies for Intelligent
Transportation Systems (MT-ITS), Naples, Italy, 26–28 June 2017; pp. 232–237.
129. Gupta, A.; Ong, Y.S. Memetic Computation: The Mainspring of Knowledge Transfer in a Data-Driven Optimization Era; Springer:
Berlin/Heidelberg, Germany, 2018; Volume 21.
130. Neri, F.; Cotta, C. Memetic algorithms and memetic computing optimization: A literature review. Swarm Evol. Comput. 2012,
2, 1–14. [CrossRef]
131. Gonzalez, M.C.; Hidalgo, C.A.; Barabasi, A.L. Understanding individual human mobility patterns. Nature 2008, 453, 779.
[CrossRef]
132. Chatterjee, S.; Mitra, B.; Chakraborty, S. Type2Motion: Detecting Mobility Context from Smartphone Typing. In Proceedings of
the 24th Annual International Conference on Mobile Computing and Networking, New Delhi, India, 29 October–2 November
2018; pp. 753–755.
133. Tselentis, D.I.; Yannis, G.; Vlahogianni, E.I. Innovative motor insurance schemes: A review of current practices and emerging
challenges. Accid. Anal. Prev. 2017, 98, 139–148. [CrossRef]
134. Andreasson, H.; Bouguerra, A.; Cirillo, M.; Dimitrov, D.N.; Driankov, D.; Karlsson, L.; Lilienthal, A.J.; Pecora, F.; Saarinen, J.P.;
Sherikov, A.; et al. Autonomous transport vehicles: Where we are and what is missing. IEEE Robot. Autom. Mag. 2015, 22, 64–75.
[CrossRef]
135. Kulik, A.; Dergachev, K. Intelligent transport systems in aerospace engineering. In Intelligent Transportation Systems–Problems and
Perspectives; Springer: Berlin/Heidelberg, Germany, 2016; pp. 243–303.
136. Barmpounakis, E.N.; Vlahogianni, E.I.; Golias, J.C. Unmanned Aerial Aircraft Systems for transportation engineering: Current
practice and future challenges. Int. J. Transp. Sci. Technol. 2016, 5, 111–122. [CrossRef]
137. Lin, M.; Hsu, W.J. Mining GPS data for mobility patterns: A survey. Pervasive Mob. Comput. 2014, 12, 1–16. [CrossRef]
138. Khodaei, M.; Papadimitratos, P. The key to intelligent transportation: Identity and credential management in vehicular
communication systems. IEEE Veh. Technol. Mag. 2015, 10, 63–69. [CrossRef]
139. Menouar, H.; Guvenc, I.; Akkaya, K.; Uluagac, A.S.; Kadri, A.; Tuncer, A. UAV-enabled intelligent transportation systems for the
smart city: Applications and challenges. IEEE Commun. Mag. 2017, 55, 22–28. [CrossRef]
140. Sucasas, V.; Mantas, G.; Saghezchi, F.B.; Radwan, A.; Rodriguez, J. An autonomous privacy-preserving authentication scheme for
intelligent transportation systems. Comput. Secur. 2016, 60, 193–205. [CrossRef]
141. Yuan, Y.; Wang, F.Y. Towards blockchain-based intelligent transportation systems. In Proceedings of the 2016 IEEE 19th
International Conference on Intelligent Transportation Systems (ITSC), Rio de Janeiro, Brazil, 1–4 November 2016; pp. 2663–2668.
142. Lei, A.; Cruickshank, H.; Cao, Y.; Asuquo, P.; Ogah, C.P.A.; Sun, Z. Blockchain-based dynamic key management for heterogeneous
intelligent transportation systems. IEEE Internet Things J. 2017, 4, 1832–1843. [CrossRef]
143. Zheng, X.; Chen, W.; Wang, P.; Shen, D.; Chen, S.; Wang, X.; Zhang, Q.; Yang, L. Big data for social transportation. IEEE Trans.
Intell. Transp. Syst. 2015, 17, 620–630. [CrossRef]
144. He, J.; Shen, W.; Divakaruni, P.; Wynter, L.; Lawrence, R. Improving traffic prediction with tweet semantics. In Proceedings of the
Twenty-Third International Joint Conference on Artificial Intelligence, Beijing, China, 3–9 August 2013.
145. Ni, M.; He, Q.; Gao, J. Forecasting the subway passenger flow under event occurrences with social media. IEEE Trans. Intell.
Transp. Syst. 2016, 18, 1623–1632. [CrossRef]
146. Evans-Cowley, J.S.; Griffin, G. Microparticipation with social media for community engagement in transportation planning.
Transp. Res. Rec. 2012, 2307, 90–98. [CrossRef]
147. Kuflik, T.; Minkov, E.; Nocera, S.; Grant-Muller, S.; Gal-Tzur, A.; Shoor, I. Automating a framework to extract and analyse
transport related social media content: The potential and the challenges. Transp. Res. Part C Emerg. Technol. 2017, 77, 275–291.
[CrossRef]
148. Woodcock, J.; Edwards, P.; Tonne, C.; Armstrong, B.G.; Ashiru, O.; Banister, D.; Beevers, S.; Chalabi, Z.; Chowdhury, Z.;
Cohen, A.; et al. Public health benefits of strategies to reduce greenhouse-gas emissions: Urban land transport. Lancet 2009,
374, 1930–1943. [CrossRef]
149. Chen, Z.; Xia, J.C.; Irawan, B. Development of fuzzy logic forecast models for location-based parking finding services. Math.
Probl. Eng. 2013, 2013, 473471. [CrossRef]
150. Kramers, A. Designing next generation multimodal traveler information systems to support sustainability-oriented decisions.
Environ. Model. Softw. 2014, 56, 83–93. [CrossRef]
151. Zhang, S.; Lee, C.K.; Chan, H.K.; Choy, K.L.; Wu, Z. Swarm intelligence applied in green logistics: A literature review. Eng. Appl.
Artif. Intell. 2015, 37, 154–169. [CrossRef]
152. Feigon, S.; Murphy, C. Shared Mobility and the Transformation of Public Transit; TRID: Washington, DC, USA, 2016.
153. Vlahogianni, E.I.; Barmpounakis, E.N. Driving analytics using smartphones: Algorithms, comparisons and challenges. Transp.
Res. Part C Emerg. Technol. 2017, 79, 196–206. [CrossRef]
154. Huang, Y.; Ng, E.C.; Zhou, J.L.; Surawski, N.C.; Chan, E.F.; Hong, G. Eco-driving technology for sustainable road transport:
A review. Renew. Sustain. Energy Rev. 2018, 93, 596–609. [CrossRef]
155. Adamidis, F.K.; Mantouka, E.G.; Barmpounakis, E.N.; Vlahogianni, E.I. Impacts of Eco Driving on Traffic Flow and Emissions in Large
Scale Urban Networks; Technical Report; TRID: Washington, DC, USA, 2019.
156. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2009, 22, 1345–1359. [CrossRef]
157. Bajwa, S.I.; Chung, E.; Kuwahara, M. Performance evaluation of an adaptive travel time prediction model. In Proceedings of the
Intelligent Transportation Systems, Vienna, Austria, 13–16 September 2005; pp. 1000–1005.
158. Getachew, A.; Obrien, E.J. Simplified site-specific traffic load models for bridge assessment. Struct. Infrastruct. Eng. 2007,
3, 303–311. [CrossRef]
159. Habibzadeh, H.; Boggio-Dandry, A.; Qin, Z.; Soyata, T.; Kantarci, B.; Mouftah, H.T. Soft sensing in smart cities: Handling 3Vs
using recommender systems, machine intelligence, and data analytics. IEEE Commun. Mag. 2018, 56, 78–86. [CrossRef]
160. Shew, C.; Pande, A.; Nuworsoo, C. Transferability and robustness of real-time freeway crash risk assessment. J. Saf. Res. 2013,
46, 83–90. [CrossRef] [PubMed]
161. Xu, C.; Wang, W.; Liu, P.; Guo, R.; Li, Z. Using the Bayesian updating approach to improve the spatial and temporal transferability
of real-time crash risk prediction models. Transp. Res. Part C Emerg. Technol. 2014, 38, 167–176. [CrossRef]
162. Ibisch, A.; Stümper, S.; Altinger, H.; Neuhausen, M.; Tschentscher, M.; Schlipsing, M.; Salinen, J.; Knoll, A. Towards autonomous
driving in a parking garage: Vehicle localization and tracking using environment-embedded lidar sensors. In Proceedings of the
2013 IEEE Intelligent Vehicles Symposium (IV), Gold Coast City, Australia, 23–26 June 2013; pp. 829–834.
163. Shen, M.; Habibi, G.; How, J.P. Transferable Pedestrian Motion Prediction Models at Intersections. arXiv 2018, arXiv:1804.00495.
164. Smirnov, A.; Levashova, T. Knowledge fusion patterns: A survey. Inf. Fusion 2019, 52, 31–40. [CrossRef]
165. Wang, P.; Yang, L.T.; Li, J.; Chen, J.; Hu, S. Data fusion in cyber-physical-social systems: State-of-the-art and perspectives.
Inf. Fusion 2019, 51, 42–57. [CrossRef]
166. Khaleghi, B.; Khamis, A.; Karray, F.O.; Razavi, S.N. Multisensor data fusion: A review of the state-of-the-art. Inf. Fusion 2013,
14, 28–44. [CrossRef]
167. Ramachandran, U.; Kumar, R.; Wolenetz, M.; Cooper, B.; Agarwalla, B.; Shin, J.; Hutto, P.; Paul, A. Dynamic data fusion for future
sensor networks. ACM Trans. Sens. Networks (TOSN) 2006, 2, 404–443. [CrossRef]
168. Nguyen, T.T.; Yang, S.; Branke, J. Evolutionary dynamic optimization: A survey of the state of the art. Swarm Evol. Comput. 2012,
6, 1–24. [CrossRef]
169. Mavrovouniotis, M.; Li, C.; Yang, S. A survey of swarm intelligence for dynamic optimization: Algorithms and applications.
Swarm Evol. Comput. 2017, 33, 1–17. [CrossRef]
170. Chang, W.C.; Cho, C.W. Online boosting for vehicle detection. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2009, 40, 892–902.
[CrossRef] [PubMed]
171. Saadallah, A.; Moreira-Matias, L.; Sousa, R.; Khiari, J.; Jenelius, E.; Gama, J. BRIGHT-Drift-Aware Demand Predictions for Taxi
Networks. IEEE Trans. Knowl. Data Eng. 2019, in press. [CrossRef]
172. Moreira-Matias, L.; Gama, J.; Ferreira, M.; Mendes-Moreira, J.; Damas, L. Predicting taxi–passenger demand using streaming
data. IEEE Trans. Intell. Transp. Syst. 2013, 14, 1393–1402. [CrossRef]
173. Sun, S.; Shi, H.; Wu, Y. A survey of multi-source domain adaptation. Inf. Fusion 2015, 24, 84–92. [CrossRef]
174. Ou, C.; Ouali, C.; Karray, F. Transfer Learning Based Strategy for Improving Driver Distraction Recognition. In Proceedings
of the International Conference Image Analysis and Recognition, Póvoa de Varzim, Portugal, 27–29 June 2018; pp. 443–452.
175. Xing, Y.; Lv, C.; Wang, H.; Cao, D.; Velenis, E.; Wang, F.Y. Driver activity recognition for intelligent vehicles: A deep learning
approach. IEEE Trans. Veh. Technol. 2019, 68, 5379–5390. [CrossRef]
176. Xing, Y.; Tang, J.; Liu, H.; Lv, C.; Cao, D.; Velenis, E.; Wang, F.Y. End-to-End Driving Activities and Secondary Tasks Recognition
Using Deep Convolutional Neural Network and Transfer Learning. In Proceedings of the 2018 IEEE Intelligent Vehicles
Symposium (IV), Changshu, China, 26 June–1 July 2018; pp. 1626–1631.
177. Ye, H.; Liang, L.; Li, G.Y.; Kim, J.; Lu, L.; Wu, M. Machine learning for vehicular networks: Recent advances and application
examples. IEEE Veh. Technol. Mag. 2018, 13, 94–101. [CrossRef]
178. Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated optimization: Distributed machine learning for on-device
intelligence. arXiv 2016, arXiv:1610.02527.
179. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from
Decentralized Data. In Proceedings of the Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017;
pp. 1273–1282.
180. Ferdowsi, A.; Challita, U.; Saad, W. Deep Learning for Reliable Mobile Edge Analytics in Intelligent Transportation Systems: An
Overview. IEEE Veh. Technol. Mag. 2019, 14, 62–70. [CrossRef]
181. Vögel, H.J.; Süß, C.; Hubregtsen, T.; André, E.; Schuller, B.; Härri, J.; Conradt, J.; Adi, A.; Zadorojniy, A.; Terken, J.; et al.
Emotion-awareness for intelligent vehicle assistants: A research agenda. In Proceedings of the IEEE/ACM 1st International
Workshop on Software Engineering for AI in Autonomous Systems (SEFAIAS), Gothenburg, Sweden, 28 May 2018; pp. 11–15.
182. Kroll, A. Grey-box models: Concepts and application. New Front. Comput. Intell. Its Appl. 2000, 57, 42–51.
183. Oussar, Y.; Dreyfus, G. How to be a gray box: Dynamic semi-physical modeling. Neural Netw. 2001, 14, 1161–1172. [CrossRef]
184. Inga, J.; Flad, M.; Diehm, G.; Hohmann, S. Gray-Box Driver Modeling and Prediction: Benefits of Steering Primitives. In Proceedings
of the 2015 IEEE International Conference on Systems, Man, and Cybernetics, Kowloon Tong, Hong Kong, 9–12 October
2015; pp. 3054–3059.
185. Flad, M.; Fröhlich, L.; Hohmann, S. Cooperative shared control driver assistance systems based on motion primitives and
differential games. IEEE Trans. Hum. Mach. Syst. 2017, 47, 711–722. [CrossRef]
186. Mittal, S. A survey of techniques for approximate computing. ACM Comput. Surv. (CSUR) 2016, 48, 62. [CrossRef]
187. Alwadi, M.; Chetty, G. Energy Efficient Data Mining Scheme for High Dimensional Data. Procedia Comput. Sci. 2015, 46, 483–490.
[CrossRef]
188. Han, J.; Orshansky, M. Approximate computing: An emerging paradigm for energy-efficient design. In Proceedings of the 2013
18th IEEE European Test Symposium (ETS), Avignon, France, 27–30 May 2013; pp. 1–6.
189. Lane, N.D.; Georgiev, P. Can deep learning revolutionize mobile sensing? In Proceedings of the 16th International Workshop on
Mobile Computing Systems and Applications, Santa Fe, NM, USA, 12–13 February 2015; pp. 117–122.
190. Faisal, S. Towards Energy Efficient Data Mining & Graph Processing. Ph.D. Thesis, The Ohio State University, Columbus, OH,
USA, 2015.
191. Zliobaite, I.; Hollmen, J.; Koskinen, L.; Teittinen, J. Towards hardware-driven design of low-energy algorithms for data analysis.
ACM SIGMOD Rec. 2015, 43, 15–20. [CrossRef]
192. Arrieta, A.B.; Díaz-Rodríguez, N.; Del Ser, J.; Bennetot, A.; Tabik, S.; Barbado, A.; García, S.; Gil-López, S.; Molina, D.;
Benjamins, R.; et al. Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward
Responsible AI. arXiv 2019, arXiv:1910.10045.
193. Martin, K. Ethical implications and accountability of algorithms. J. Bus. Ethics 2018, 160, 835–850. [CrossRef]
194. Veale, M.; Binns, R. Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data. Big
Data Soc. 2017, 4, 2053951717743530. [CrossRef]
195. Stoyanovich, J.; Howe, B.; Abiteboul, S.; Miklau, G.; Sahuguet, A.; Weikum, G. Fides: Towards a platform for responsible data
science. In Proceedings of the 29th International Conference on Scientific and Statistical Database Management, Chicago, IL,
USA, 27–29 June 2017; p. 26.
196. Whittaker, M.; Crawford, K.; Dobbe, R.; Fried, G.; Kaziunas, E.; Mathur, V.; West, S.M.; Richardson, R.; Schultz, J.; Schwartz, O.
AI Now Report 2018; AI Now Institute at New York University: New York, NY, USA, 2018.
197. Victor, N.; Lopez, D.; Abawajy, J.H. Privacy models for big data: A survey. Int. J. Big Data Intell. 2016, 3, 61–75. [CrossRef]
198. Rashidi, T.H.; Abbasi, A.; Maghrebi, M.; Hasan, S.; Waller, T.S. Exploring the capacity of social media data for modelling travel
behaviour: Opportunities and challenges. Transp. Res. Part C Emerg. Technol. 2017, 75, 197–211. [CrossRef]
199. Boyd, D.; Crawford, K. Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon.
Inf. Commun. Soc. 2012, 15, 662–679. [CrossRef]
200. Chen, Y.; Guizani, M.; Zhang, Y.; Wang, L.; Crespi, N.; Lee, G.M. When traffic flow prediction meets wireless big data analytics.
arXiv 2017, arXiv:1709.08024.
201. Wilson, B.; Hoffman, J.; Morgenstern, J. Predictive inequity in object detection. arXiv 2019, arXiv:1902.11097.
202. Lim, H.S.M.; Taeihagh, A. Algorithmic decision-making in AVs: Understanding ethical and technical concerns for smart cities.
Sustainability 2019, 11, 5791. [CrossRef]
203. Bigman, Y.E.; Gray, K. Life and death decisions of autonomous vehicles. Nature 2020, 579, E1–E2. [CrossRef]
204. Fu, K.; Alhamadani, A.; Ji, T.; Lu, C.T. Batman or the joker? The powerful urban computing and its ethics issues. SIGSPATIAL
Spec. 2019, 11, 16–25. [CrossRef]
205. Leben, D. Normative Principles for Evaluating Fairness in Machine Learning. In Proceedings of the AAAI/ACM Conference on
AI, Ethics, and Society, New York, NY, USA, 7–8 February 2020; pp. 86–92.
206. Verma, S.; Rubin, J. Fairness definitions explained. In Proceedings of the IEEE/ACM International Workshop on Software
Fairness (FairWare), Gothenburg, Sweden, 29 May 2018; pp. 1–7.
207. Zook, M.; Barocas, S.; Crawford, K.; Keller, E.; Gangadharan, S.P.; Goodman, A.; Hollander, R.; Koenig, B.A.; Metcalf, J.;
Narayanan, A.; et al. Ten Simple Rules for Responsible Big Data Research. 2017. Available online: https://journals.plos.org/
ploscompbiol/article?id=10.1371/journal.pcbi.1005399 (accessed on 5 January 2019).
208. Fei-Fei, L.; Fergus, R.; Perona, P. One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 2006, 28, 594–611.
[CrossRef]
209. Wang, Y.; Yao, Q.; Kwok, J.; Ni, L.M. Generalizing from a Few Examples: A Survey on Few-Shot Learning. arXiv 2019,
arXiv:1904.05046.
210. Krawczyk, B. Learning from imbalanced data: Open challenges and future directions. Prog. Artif. Intell. 2016, 5, 221–232.
[CrossRef]
211. Branco, P.; Torgo, L.; Ribeiro, R.P. A survey of predictive modeling on imbalanced domains. ACM Comput. Surv. (CSUR) 2016,
49, 31. [CrossRef]
212. Fernandez, A.; Herrera, F.; Cordon, O.; del Jesus, M.J.; Marcelloni, F. Evolutionary Fuzzy Systems for Explainable Artificial
Intelligence: Why, When, What for, and Where to? IEEE Comput. Intell. Mag. 2019, 14, 69–81. [CrossRef]
213. Mencar, C.; Alonso, J.M. Paving the Way to Explainable Artificial Intelligence with Fuzzy Modeling. In Proceedings of the
International Workshop on Fuzzy Logic and Applications, Genoa, Italy, 6–7 September 2018; pp. 215–227.
214. Li, Y. Deep Reinforcement Learning: An Overview. arXiv 2017, arXiv:1701.07274.
215. Nisan, N.; Roughgarden, T.; Tardos, E.; Vazirani, V.V. Algorithmic Game Theory; Cambridge University Press: Cambridge, UK,
2007.
216. Sallab, A.E.; Abdou, M.; Perot, E.; Yogamani, S. Deep reinforcement learning framework for autonomous driving. Electron.
Imaging 2017, 2017, 70–76. [CrossRef]
217. Ruch, C.; Richards, S.; Frazzoli, E. The Value of Coordination in One-Way Mobility-on-Demand Systems. IEEE Trans. Netw. Sci.
Eng. 2019, in press. [CrossRef]
218. Aldeen, Y.A.A.S.; Salleh, M.; Razzaque, M.A. A comprehensive review on privacy preserving data mining. SpringerPlus 2015,
4, 694. [CrossRef]
219. Agrawal, R.; Srikant, R. Privacy-preserving data mining. In Proceedings of the ACM SIGMOD International Conference on
Management of Data, Dallas, TX, USA, 16–18 May 2000; Volume 29, pp. 439–450.
220. Mendes, R.; Vilela, J.P. Privacy-preserving data mining: Methods, metrics, and applications. IEEE Access 2017, 5, 10562–10582.
[CrossRef]
221. Ding, W.; Jing, X.; Yan, Z.; Yang, L.T. A survey on data fusion in internet of things: Towards secure and privacy-preserving fusion.
Inf. Fusion 2019, 51, 129–144. [CrossRef]
222. Zhou, Y.; Chen, S.; Mo, Z.; Yin, Y. Privacy preserving origin-destination flow measurement in vehicular cyber-physical systems.
In Proceedings of the 2013 IEEE 1st International Conference on Cyber-Physical Systems, Networks, and Applications (CPSNA),
Taipei, Taiwan, 19–20 August 2013; pp. 32–37.
223. Florian, M.; Finster, S.; Baumgart, I. Privacy-preserving cooperative route planning. IEEE Internet Things J. 2014, 1, 590–599.
[CrossRef]
224. Rabieh, K.; Mahmoud, M.M.; Younis, M. Privacy-preserving route reporting scheme for traffic management in VANETs.
In Proceedings of the 2015 IEEE International Conference on Communications (ICC), London, UK, 8–12 June 2015; pp. 7286–7291.
225. Kim, S.W.; Park, S.; Won, J.I.; Kim, S.W. Privacy preserving data mining of sequential patterns for network traffic data. Inf. Sci.
2008, 178, 694–713. [CrossRef]
226. Huang, L.; Joseph, A.D.; Nelson, B.; Rubinstein, B.I.; Tygar, J. Adversarial machine learning. In Proceedings of the 4th ACM
Workshop on Security and Artificial Intelligence, Chicago, IL, USA, 21 October 2011; pp. 43–58.
227. Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks.
arXiv 2013, arXiv:1312.6199.
228. Akhtar, N.; Mian, A. Threat of adversarial attacks on deep learning in computer vision: A survey. IEEE Access 2018, 6, 14410–14430.
[CrossRef]
229. Nguyen, G.; Dlugolinsky, S.; Bobák, M.; Tran, V.; García, Á.L.; Heredia, I.; Malík, P.; Hluchý, L. Machine Learning and Deep
Learning frameworks and libraries for large-scale data mining: A survey. Artif. Intell. Rev. 2019, 52, 77–124. [CrossRef]
230. Hutter, F.; Kotthoff, L.; Vanschoren, J. Automated Machine Learning: Methods, Systems, Challenges; Springer Nature:
Berlin/Heidelberg, Germany, 2019.
