Iotac 1
Abstract—The critical everyday activities handled by modern IoT Systems make security a major concern for both end-users and the industry. Securing the IoT System Architecture is a common way to strengthen its resilience to malicious attacks. However, the security of the software running on the IoT must be considered as well, since the exploitation of its vulnerabilities can compromise the security of the overall system, regardless of how secure its architecture may be. Thus, we present an IoT Software Security-by-Design (SSD) Platform, which provides mechanisms for monitoring and optimizing the security of IoT software applications throughout their development lifecycle, to validate the broader security of the IoT software. This paper describes the proposed SSD Platform, which leverages security information from all phases of development, using novel mechanisms that have been implemented and which can lead to a holistic security evaluation and future security certification.

Keywords—Internet of Things, Software Security, Requirements Engineering, Static Analysis, Vulnerability Prediction

I. INTRODUCTION

Modern Internet of Things (IoT) Systems consist of a large number of interconnected and highly diverse devices, such as sensors, actuators, gateways, etc., often accessible and controllable through the Internet. The high interconnectivity and accessibility of modern IoT Systems, along with the criticality of the daily activities that they monitor and control (e.g., smart living, autonomous driving, industrial control, etc.), render their security an aspect of utmost concern, both for the users and the owning enterprises [1].

An effective way of securing an IoT System is by securing its architecture. This can be achieved through conformance to International IoT Security Standards and the deployment of various security countermeasures, such as intelligent attack detection, prevention, and mitigation mechanisms, security gateways, honeypots, etc. Several initiatives have recently focused on extending well-established IoT Architectures (e.g., the ISO/IEC 30141 Reference Architecture [2]) towards strengthening their security (e.g., SerIoT [1]). Apart from a secure IoT Architecture, however, the software that is running on the different nodes of an IoT System should also be considered. As per the “security of the weakest link” principle, if the software contains vulnerabilities, the security of the overall system could be compromised regardless of how secure the overall architecture may be. Hence, to ensure a secure IoT System, the security level of the software running on its nodes should be assessed and optimized throughout its development.

To this end, we develop the Software Security-by-Design (SSD) Platform, i.e., a novel software security monitoring and optimization platform that provides mechanisms for assessing and improving the security of IoT software applications, throughout their overall Software Development Lifecycle (SDLC). In particular, SSD allows the developers of an IoT software application to (i) ensure the correct definition of the security requirements, (ii) ensure the adherence of the produced IoT software application to the originally defined requirements, (iii) evaluate the security level of the source code of the IoT software application, and (iv) provide recommendations for security improvements. In that way, the SSD Platform provides a more holistic software security assessment, as it covers all the phases of the SDLC horizontally.

The purpose of the present paper is to present the overview of the envisaged SSD Platform and describe its main functionalities, i.e., the main novel mechanisms that have been proposed and developed so far. The SSD Platform is one of the main outcomes of the IoTAC Project, an EU Project funded through the Horizon2020 Programme.

In the sequel, Section 2 discusses the related work focusing on the main challenges that we try to address. Section 3 provides an overview of the broader SSD Platform, whereas Section 4 describes the main novel mechanisms that have been developed so far. Finally, Section 5 concludes the paper and discusses directions for future work.

II. RELATED WORK

According to the Security-by-Design paradigm, security should be monitored and optimized at all phases of the SDLC, and particularly during the Requirements, Design, Coding, and Testing phases.

During the Design and Requirement phases, security can be added by ensuring that the security requirements are correctly defined, since a large portion of software vulnerabilities stem from missing, incorrect, or vague security requirements [3]. Although several approaches for specifying, verifying, and validating functional requirements exist [4]–[6], highly limited
Fig. 3: The high-level overview of the Software Security Requirements Verification and Validation mechanism

As can be seen in Figure 3, the mechanism receives as input the Ontology instances of a security requirement, i.e., its main semantic concepts (e.g., Action, Actor, Object, etc.), as derived by the specification mechanism presented in Section IV-A1. Subsequently, these instances are compared to the ontology instances of the carefully curated list of security requirements that are stored in the Software Security Requirements Knowledge Base. In particular, similarity checks are performed between the user-defined requirement and those of the curated list, utilizing popular NLP toolkits (e.g., WordNet with NLTK²). Based on the values of the calculated similarity scores, several recommendations for improvement are provided, such as: (i) rephrasing the analyzed security requirement based on a highly similar requirement found in the curated list, (ii) inclusion of additional security requirements that are observed to be closely related to the analyzed requirement, and (iii) changing the priority of the analyzed security requirement based on the priority of similar requirements in the curated list.

² https://www.nltk.org/

Fig. 4: Overview of the Security Evaluation Framework

As can be seen in Figure 4, SEF receives as input a software application and employs static analysis to detect potential security issues (i.e., security-related static analysis alerts) that may reside in the software. This is achieved mainly via a popular static analysis platform, namely SonarQube³, which is configured to detect important security vulnerabilities (e.g., SQL Injection, Cross-site Scripting, Memory Leaks, Weak Cryptography, etc.). Additional open-source static code analyzers are also utilized (e.g., CppCheck and FindBugs) through dedicated SonarQube plugins.

³ https://www.sonarqube.org/

Subsequently, the low-level static analysis alerts are fed to the Security Measures Computation mechanism, which aggregates them to compute the high-level security measures. IoT-specific security models are utilized (leveraging concepts from state-of-the-art security and quality models [8], [9]) to determine (i) the high-level security measures that should be computed, and (ii) which low-level static analysis results should be aggregated (and in what way) to quantify those
measures. The output of SEF is a report containing the detailed results of the analysis, and particularly: (i) the calculated high-level security measures, and (ii) the low-level static analysis alerts that were utilized for computing those measures.

Security Alerts Criticality Assessor: Although effective in detecting security issues, static analysis is known to produce long lists of alerts, most of which are not critical from a security viewpoint. This hinders its practicality, since developers often have to go through the tedious process of triaging the alerts to detect those that correspond to critical security issues.

In an attempt to address the aforementioned problem, we propose a novel mechanism for assessing the criticality of security-related static analysis alerts [13]. In particular, we developed a self-adaptive technique, the Security Alerts Criticality Assessor (SACA), for classifying and prioritizing security-related static analysis alerts based on their criticality, by considering information retrieved from (i) the alerts themselves, (ii) vulnerability prediction (see Section IV-B2), and (iii) user feedback. The proposed technique is based on machine learning models, particularly on neural networks, which were built using data retrieved from static analysis reports of real-world software applications. The high-level overview of the tool is presented in Figure 5.

demand, based on user feedback. In that way, the SAA adapts to the specific characteristics of the software product to which it is applied, to provide more accurate assessments.

2) Vulnerability Prediction: Vulnerability Prediction is responsible for the identification of security hotspots, i.e., software components (e.g., classes) that are likely to contain critical vulnerabilities. For the identification of potentially vulnerable software components, vulnerability prediction models (VPM) are constructed, which are mainly machine learning models that are built based on software attributes retrieved primarily from the source code of the analyzed software (e.g., software metrics, text features, etc.). The results of the vulnerability prediction models are highly useful for developers and project managers, as they allow them to better prioritize their testing and fortification efforts by allocating limited test resources to high-risk (i.e., potentially vulnerable) areas.

a) Component-level Vulnerability Prediction: Existing vulnerability prediction models focus on the intrinsic characteristics of the analyzed software component, and particularly on attributes of its source code, in order to judge whether it is vulnerable or not. Among the different attributes that have been examined in the literature, those that are derived through text mining have demonstrated the most promising results [10], [14]. To this end, as part of the Software Security Assurance module of the SSD Platform, we developed vulnerability prediction models based on deep learning, utilizing as input the sequences of word tokens that reside in the source code of the component, and word embedding vectors for their effective representation [15]. A high-level overview of the proposed models is illustrated in Figure 6.
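To make the input representation concrete, the following minimal Python sketch shows one way in which the source code of a component could be turned into a fixed-length sequence of integer token ids, i.e., the form typically expected by an embedding layer. It is an illustration under stated assumptions and not the implementation used in the SSD Platform: the regular expression, the reserved PAD and UNK ids, the vocabulary size, and the function names (tokenize, build_vocab, encode) are all hypothetical choices made for this example.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption of this sketch): identifiers and
# keywords, integer literals, or any single non-whitespace symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")

# Reserved ids, also an assumption: 0 pads short sequences, 1 marks
# tokens that are not in the vocabulary.
PAD, UNK = 0, 1


def tokenize(source):
    """Split the source code of a component into a sequence of word tokens."""
    return TOKEN_RE.findall(source)


def build_vocab(components, max_size=10000):
    """Map the most frequent tokens across all components to ids >= 2."""
    counts = Counter(tok for src in components for tok in tokenize(src))
    return {tok: i + 2 for i, (tok, _) in enumerate(counts.most_common(max_size))}


def encode(source, vocab, length):
    """Encode one component as exactly `length` ids (truncate, then pad)."""
    ids = [vocab.get(tok, UNK) for tok in tokenize(source)][:length]
    return ids + [PAD] * (length - len(ids))


# Two toy "components" used only for illustration.
components = [
    'String q = "SELECT * FROM users WHERE id=" + id;',
    "int total = a + b;",
]
vocab = build_vocab(components)
encoded = [encode(src, vocab, length=12) for src in components]
```

Each row of `encoded` could then be passed to an embedding layer that maps every id to a dense vector, followed by a deep learning classifier that outputs the likelihood of the component being vulnerable. Padding and truncation to a common length are needed only because such models are usually trained on batches of equal-length sequences.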