KPIs for Cybersecurity
COMPUTER SECURITY
Outpost24 is one company that specializes in vulnerability management. By using their scanning
tool OUTSCAN™, Outpost24 can identify vulnerabilities in network components, such as
firewalls, switches, printers, devices, servers, workstations and other computer systems. These
results are then stored in a database. In this study, the authors work together with Outpost24 on
this data. The goal is to define a process for generating vulnerability reports for the company.
The process performs a statistical analysis of the data and presents the findings.
To solve the task a report was created, during which the process was documented. The work
began with a background study into Key Performance Indicators (KPIs), in which the most
common security KPIs were identified from similar works. A tool was also developed to help
with the analysis. This resulted in a statistical analysis using Outpost24’s dataset. By presenting
the data formatted by the KPIs, trends could be identified. This showed an overall trend of
increasing vulnerabilities and the necessity for organizations to spend resources towards security.
The KPIs offer other possibilities, such as creating a baseline for security evaluation using data
from one year. In the future, one could use the KPIs to compare how the security situation has
changed.
Summary
In today's connected society, with technology that has developed rapidly, there is great interest
in offering technical services in our everyday lives. Because these services are used to handle
sensitive information and money, there are also requirements on security. If faults occur in these
systems, the security of all users and of those responsible is put at risk. These systems must be
secured and must remain secure regardless of whether vulnerabilities appear.
Outpost24 is a company that works with identifying and managing vulnerabilities in systems. Using
the scanning tool OUTSCAN™, Outpost24 can find vulnerabilities in network devices such as
firewalls, switches, printers, devices, servers, workstations and other computer systems. These
are then stored in a database. This study works towards creating a process for generating reports
using data from Outpost24. The reports are intended to present a statistical analysis of the
vulnerabilities found, based on this data.
To create the report, a process will be developed. It will be produced by creating a first report
together with the tools required. As the basis for the report, a study of related work will be
carried out. This will help delimit and determine the contents of the report. Key performance
indicators will be identified, and the most common ones will be included in the final report. The
result was a process and tools that the company can use to create these reports regularly in the
future. In connection with the creation of the first report, a statistical study was also made of
data from 2016. This gave an overview of the IT security landscape during that year and a couple
of years back. The trend was that the number of vulnerabilities is increasing and that
organizations should spend more resources on remediating them. By using the KPIs, a baseline value
for security evaluation can also be produced for a given year. In the future, comparisons against
this value would show how the security situation changes.
Preface
This thesis is the result of the authors’ degree project in Master of Science in Engineering:
Computer Security at Blekinge Institute of Technology in Karlskrona, Sweden.
The opportunity to work with this topic was offered by Outpost24, a security company based in
Karlskrona, Sweden. By providing the idea for the task together with the dataset necessary, they
enabled us to perform this work. Outpost24 focuses on vulnerability management and has
offices in multiple countries. Using the dataset from their automated vulnerability scanner, we
present trends and statistics found while analysing the data. The goal is to spread information
about the vulnerability landscape and utilize the data available to the company.
Acknowledgements
We would like to extend our gratitude to Martin Boldt, our supervisor during this project. By
offering his expertise, he assisted us in constantly moving forward with the work. Also, his ideas
and interest in the topic helped maintain a positive attitude throughout the project.
We would also like to thank Martin Jartelius, our contact person from Outpost24. By providing
us with all the resources required, together with insightful feedback, we could continuously work
on the project. At the same time, we thank Markus Hervén, also from Outpost24, who assisted us
with the data extraction process. Providing all the data required in a timely manner, he allowed
us to keep working with minimal delay.
We would finally like to thank the others at Outpost24 for providing us with the opportunity to
write this thesis and work with such interesting data.
Nomenclature
Acronyms
List of Figures
4.1 Matrix of the chosen KPIs and the reports in which they were found.
4.2 Average severity of seen vulnerabilities.
4.3 Average proportion of severity levels over time.
4.4 Proportion of vulnerabilities by platform, split by severity.
4.5 Proportion of platforms within the top 10 most common.
4.6 Proportion of targets with vulnerabilities.
4.7 Percentage of vulnerable targets over time.
4.8 Percentage of risks remediated over time.
4.9 The average time in days for remediation of vulnerabilities.
4.10 The average time until remediation for products.
4.11 Average remediation times over time.
4.12 Vulnerability categories for the observed CVEs, based on percentage.
4.13 Relative vulnerability amount over time in comparison to the 2016 baseline.
4.14 Average CVSS score over time.
4.15 The top ten out of all found CVEs.
List of Tables
3.1 Search terms used for finding the commercial white papers.
3.2 White papers used in the background study.
3.3 Vulnerability information fields describing the available data.
3.4 The tables containing the data that were available and their contents.
3.5 Documents used to structure information and compile the process definition.
3.6 Planned sections of the report.
4.1 Top five most common High-severity vulnerabilities.
4.2 Top five most common vulnerabilities that are over ten years old.
Table of Contents
ABSTRACT
PREFACE
NOMENCLATURE
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Introduction
1.2 Background
1.3 Objectives
1.4 Delimitations
2 THEORETICAL FRAMEWORK
3 METHOD
3.5 Validation
4 RESULTS
5 DISCUSSION
6 CONCLUSIONS
7 RECOMMENDATIONS AND FUTURE WORK
8 REFERENCES
1 INTRODUCTION
1.1 Introduction
The internet is today used in our everyday lives to perform a wide array of different actions. It
may be social networks, shopping or managing a bank account. This means that many
organizations require public facing systems to provide these services. The systems should allow
a set of intended actions and nothing else. For example, users should only be able to access bank
accounts belonging to them. Maintainers of these systems must therefore see that they perform
securely as intended.
Sometimes, unintended vulnerabilities occur in systems and affect security to different
degrees. These vulnerabilities may be caused by errors during the development or installation of
a system. During 2016 alone, 6435 vulnerabilities were published to one public database [1].
However, managing these is not always trivial. The problem is that organizations are often
unaware of vulnerabilities in their systems and the risks associated with them. Some may also
lack the necessary funds or expertise to remediate an issue once found. The security landscape
and the impact of vulnerabilities are not always clear.
For companies to be able to understand vulnerabilities and the risks they pose, information
regarding these should be presented. Relevant metrics that can evaluate the security levels, called
Key Performance Indicators (KPIs), can be studied and compiled. Using these, information about
the security levels could be compiled and presented to organizations. This would showcase the
importance of vulnerability management and pinpoint areas where more resources are needed.
Using charts and plots to present this data would make the information easier to digest quickly.
As the analysis would show that such problems exist, it would also motivate the need for action.
This report is outlined as follows. Chapter 1 introduces the topic and describes the end goal of
this work. Chapter 2 presents background theory together with state-of-the-art information from
both scientific and commercial works on similar subjects. Chapter 3 details the method used and
the motivations for different choices. Chapter 4 presents the results regarding the KPIs, the
report generation and finally the analysis. Chapter 5 discusses each part of the work. Finally,
Chapters 6 and 7 present general conclusions and ideas for continuing the work.
1.2 Background
The work is carried out in cooperation with Outpost24 [2], a company founded in 2001 that works
with vulnerability management. Outpost24 offers services and tools to different companies with
the aim of helping them secure their systems and applications. Information about the
vulnerabilities they encounter is stored in a database, which records details such as what
software was present, when a vulnerability was found and how long it took to remediate it. The
database holds about 43 million entries, with data originating from the European region. To get a
better overview of the security landscape, the company needs a report that compiles the most
important and interesting findings about the scanned vulnerabilities. Such a report also enables
future comparisons that show the progress of the company over the years.
The data originates from their service OUTSCAN [3], a scanning tool offered as a service that
detects vulnerabilities in network-connected devices. Outpost24 wishes to make use of this data,
which is currently not being used. This would be beneficial in two ways. The first is from a
research and development perspective: the data would be arranged in a way that makes it possible
to study it further and draw conclusions from it. The second is from a marketing perspective:
since there would now be a process for releasing reports, this could be done regularly. The
reports both demonstrate that the company has expertise in this area and motivate the need for
its product among customers.
1.3 Objectives
The purpose of this work is to solve an issue for Outpost24, which consists of defining a process
for generating reports that include information about the vulnerabilities they encounter. To plan
the structure and contents of the report, the first step is to research commonly used Key
Performance Indicators (KPIs) and trends related to vulnerabilities. The result will be a way to
measure the level of security in a company.
By identifying the most common KPIs and information regarding these, results that could be of
interest to organizations will be compiled. Organizations could then use the KPIs to examine
their internal vulnerability situation, and compare the results to the baseline presented in our
report. This would allow for comparisons to the overall average situation. By releasing a report,
knowledge about vulnerabilities and potential risks will also be spread, which might help
organizations plan their vulnerability management.
The process for generating a vulnerability report will be automated as much as possible, by
developing tools in Python. The data that is presented in the report will consist of data relating to
network, servers and some applications. Only the data that is necessary will be extracted from
the database for further analysis. As it originates from a real situation the data will also be fully
anonymized to respect customers’ privacy. Presenting this process together with the KPIs will
help legitimize the resulting report. It will also present new knowledge within the area, because
of an observed lack of scientific papers on the subject.
Finally, after the creation of the report, the method used will be documented in detail in order
to explain the process for generating reports. A model consisting of the tools created will help
generate future reports. Thanks to these reports, it will be easier to identify how the security
landscape changes over time. It will be possible to find trends over the last five years by
making comparisons to older data from Outpost24's extensive dataset.
1.4 Delimitations
The found KPIs were discussed with Outpost24 to see which could be retrieved. The KPIs used
would depend on the dataset. If not enough data could be retrieved for a KPI, then it might not be
possible to accurately display statistics of that KPI. One such KPI that was decided against was
region, since regions would be difficult to define and accurately normalize. Also, only KPIs
found in over 20% of the studied literature will be examined.
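The 20% cut-off mentioned above can be illustrated with a minimal Python sketch. It is not the tool developed in this work; the function name, the report names and the KPI labels are hypothetical and only serve as an example of how such a filter could be expressed.

```python
from collections import Counter


def select_common_kpis(reports, threshold=0.20):
    """Return the KPIs that appear in more than `threshold` of the reports.

    `reports` maps a report name to the list of KPIs noted in that report.
    All names used here are illustrative assumptions.
    """
    if not reports:
        return []
    counts = Counter()
    for kpis in reports.values():
        counts.update(set(kpis))  # count each KPI at most once per report
    total = len(reports)
    return sorted(kpi for kpi, n in counts.items() if n / total > threshold)


# Hypothetical example data, not taken from the studied literature.
example_reports = {
    "report_a": ["severity", "remediation time", "region"],
    "report_b": ["severity", "platform"],
    "report_c": ["severity", "remediation time"],
    "report_d": ["platform", "remediation time"],
    "report_e": ["severity"],
    "report_f": ["severity", "platform"],
}

if __name__ == "__main__":
    print(select_common_kpis(example_reports))
```

In this made-up example, "region" occurs in one of six reports (about 17%) and is therefore dropped, while the other KPIs are kept.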
We will be looking at data from up to five years back, and further back where the dataset allows.
The data available must allow for some statistical accuracy. By looking at intervals of quarters or
years, we can work with larger amounts of data and therefore more accurately represent the
bigger picture. When comparing averages, the most commonly occurring results will be
presented. This will focus the study towards the most relevant but also most statistically accurate
information.
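As a rough illustration of the quarterly intervals described above, the following Python sketch counts findings per quarter. It assumes that each finding carries a date on which it was first seen; the function names and the example dates are hypothetical and do not reflect the actual Outpost24 data layout.

```python
from collections import Counter
from datetime import date


def quarter_of(day):
    """Map a date to a (year, quarter) pair, with quarters numbered 1 to 4."""
    return day.year, (day.month - 1) // 3 + 1


def count_per_quarter(first_seen_dates):
    """Count how many findings fall into each (year, quarter) interval."""
    return Counter(quarter_of(d) for d in first_seen_dates)


# Hypothetical dates used only to demonstrate the grouping.
example_dates = [
    date(2016, 1, 15),
    date(2016, 3, 31),
    date(2016, 4, 1),
    date(2016, 12, 24),
]

if __name__ == "__main__":
    for interval, count in sorted(count_per_quarter(example_dates).items()):
        print(interval, count)
```

Grouping by year instead only requires replacing the key with `day.year`.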
The scope of the report relative to the available time was also a factor. The report could
contain more detailed studies into different cases and trends. The KPIs could also be specified
or combined further for more detail. This was decided to be out of scope for our work. The focus
was therefore directed to the general base KPIs.
RQ1 - Which Key Performance Indicators are relevant, with regards to the available data, for
measuring the level of security in organizations?
The first question will identify and present the most common KPIs within vulnerability
management. A study into similar works will be conducted where all possible KPIs are noted.
These will then be filtered into the most common and relevant ones in relation to the available
data. This is the most important question of this work and will lay the base for the other two
questions. State of the art resources from recent years will be used to assure up to date
information.
RQ2 - How can a process be developed that compiles relevant vulnerability reports for the
community?
By using the KPIs found in RQ1, a brief vulnerability report will be created using the Outpost24
dataset. This will involve developing tools programmatically. The steps taken to create this
report will be documented. The expected result is a defined process and tools that Outpost24 can
use in the future to easily create new reports.
RQ3 - Which trends over the last five years, with regard to companies’ vulnerability
exposure, can be found using the extensive dataset from Outpost24?
Finally, the report’s results will be presented together with some analysis. Here the aim is to find
ongoing trends within the data related to vulnerabilities in servers and networks. Plots will be
used to help visualize the results. The results will provide an overview of the vulnerability
landscape and set a starting point for future reports.
2 THEORETICAL FRAMEWORK
In this chapter, some general background information will be presented to help understand the
different theoretical concepts in the report. A study into related scientific works will also be
presented. Also, a few commonly occurring subjects from the commercial reports will be
summarized.
A report from NSS [5] highlights that the security area has grown, with industries that work on
identifying vulnerabilities, trends and threats in software and systems. Providing secure systems
has become an important requirement. Therefore, organizations engage companies that have this
kind of knowledge. Software security deals with assuring that software is secured from malicious
intruders that exploit defects for malign purposes [6]. The main goal is to protect against
threats that violate security properties. A vulnerability is a weakness that can be exploited by
an attacker. An attacker performing an exploit is a threat, and this poses a risk. Risks describe
the likelihood of an attack together with the consequences if it were to occur.
To secure the system, vulnerabilities need to be remediated in time. Vulnerabilities that are not
noticed or remediated may provide a window of attack that results in damage to the system.
Vulnerabilities can exist because of several factors, such as faulty configurations or software
errors, which need to be patched [7]. Other attack vectors exploit less technical
vulnerabilities, for example social engineering, in which an employee may be unknowingly
influenced by an attacker with malicious intent.
Companies that work to secure systems encounter different vulnerabilities daily, and many of
them use scanning tools to find those flaws. A study from 2009 found numerous vulnerabilities in
300 public websites by using a well-known vulnerability scanner [8]. The study shows the large
number of vulnerabilities that can be found and what this data can reveal. To properly track the
severity of vulnerabilities, there exists an open framework called the Common Vulnerability
Scoring System (CVSS), which describes the characteristics of vulnerabilities. There are two
different versions of CVSS, version 2 and version 3 [9]. CVSS uses a scoring range between 0 and
10, which gives an idea of the severity [10]. These scores are often abstracted into low (0-3.9),
medium (4-6.9) or high (7-10) vulnerabilities. Version 3 changes some of the scoring and adds the
“critical” severity (9.0-10), with high then covering 7.0-8.9. This work will use CVSSv2, as data
is included from before the release of CVSSv3.
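The CVSSv2 abstraction into low, medium and high can be written as a small Python function. This is a minimal sketch; the function name is hypothetical, and the band limits are the ones stated above for version 2.

```python
def cvss_v2_severity(score):
    """Map a CVSSv2 base score (0 to 10) to its severity band.

    Bands follow the usual CVSSv2 abstraction:
    low 0-3.9, medium 4-6.9, high 7-10.
    """
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score < 4.0:
        return "low"
    if score < 7.0:
        return "medium"
    return "high"


if __name__ == "__main__":
    for example_score in (2.1, 5.0, 9.3):
        print(example_score, cvss_v2_severity(example_score))
```

Because CVSS scores are given with one decimal, comparing against 4.0 and 7.0 yields the same bands as the ranges 0-3.9, 4-6.9 and 7-10.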
Information about vulnerabilities can be used for various analyses. One example is the study
[11], where a statistical analysis is performed on data from open databases with CVE entries.
Public online databases exist that compile information about CVEs, for example the risks they
pose and how to remediate them. The study mainly looks at the timing of patches and exploits in
relation to the disclosure dates of vulnerabilities. CVE [12] stands for Common Vulnerabilities
and Exposures and is, like CVSS, a standard, in this case for gathering and naming
vulnerabilities. Each vulnerability has a CVE identifier, which consists of the abbreviation CVE,
followed by the year it was issued to a CVE authority or published, and finally arbitrary digits
that are unique for each vulnerability, most commonly four digits. Since there exists a vast
number of vulnerabilities, there is a need to classify them by type. The Common Weakness
Enumeration (CWE) presents a hierarchical structure of software weaknesses, divided into known
security flaws [13]. There are many ways of structuring these. For most vulnerabilities, there
exist various categories and subcategories.
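The structure of a CVE identifier (the abbreviation CVE, a year and a sequence of digits) can be checked with a short Python sketch. The pattern and function names are hypothetical, and the sketch assumes the hyphen-separated form with a four-digit year and four or more sequence digits.

```python
import re

# Assumed identifier layout: "CVE-<year>-<sequence>", four or more sequence digits.
CVE_PATTERN = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(cve_id):
    """Split a CVE identifier into (year, sequence).

    The sequence is returned as a string so that leading zeros are preserved.
    """
    match = CVE_PATTERN.fullmatch(cve_id.strip().upper())
    if match is None:
        raise ValueError(f"Not a valid CVE identifier: {cve_id!r}")
    return int(match.group(1)), match.group(2)


if __name__ == "__main__":
    print(parse_cve_id("CVE-2014-0160"))
```

Extracting the year in this way makes it possible to group findings by the age of the underlying vulnerability.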
With vulnerabilities being common and often having a large negative impact, there is a need to
gain more knowledge about them and how to combat them. Despite the open standards and the
severity surrounding vulnerabilities, however, there is a lack of research in this area. The field of
security is very fast moving, as is the field of vulnerability research. This also calls for more
recent research regarding how vulnerabilities evolve. There are suggestions for this in [14] such
as for identification of emerging trends, research in the direction of criminology or for
investigation of how different risks change over time.
A study from 2012 [15] looks at risk and security assessment. The authors describe how metrics
can help businesses improve their security. One example of how to better estimate possible
security impacts is to assign a value to specific risks during a risk analysis. In their analysis,
the researchers use security metrics together with KPIs to measure security performance. Another
paper that takes up the importance of security metrics is [16], which compares different metrics
in software development to ensure a secure system.
Security metrics prove to be a key factor in achieving secure systems within computer security
[17]. There are also open research areas within security metrics, for example finding estimators
of system security. Predicting the behavior of a system requires more research into formal models
of security measurement. The measurement of security can be done with KPIs. KPIs are metrics that
evaluate the performance of an organization, and its activities, over time. They are used to
determine the progress towards reaching strategic goals [18]. KPIs can be low- or high-level and
may be specific to organizations. They are also helpful when identifying trends.
To determine security, measurements can be made based on several factors. These factors can be
the rise of new threats or the emergence of new vulnerabilities [20]. Security measurements are
useful for identifying important security risk factors, following the development of
vulnerability trends and predicting vulnerabilities in services. The study argues that security
metrics are valuable when determining the effect of changes, for example changes in network
vulnerability policies, by comparing metrics before and after a change. Another use is deciding
which security policy is most effective. The study concludes that these kinds of measurements are
required to ensure the security of a system. The values from these measurements can also be
presented graphically by using plots, which offers a better overview of the security landscape.
One of the metrics measured in the study is the accuracy of expected severity per time interval
in months. Another is the accuracy of expected risk per time interval in months. These were
presented in plots that clearly showed how the curves differed.
In one research paper [21], security metrics have been developed for computer networks by
proposing a model called Dynamic Bayesian Network. The model is closely associated with the CVSS
scoring system; however, it looks at the whole situation. The study attempts to present a metric
for the combined threat in the case that multiple vulnerabilities exist. As there are often not
just single vulnerabilities, the overall security of a system must be measured in a way that
evaluates the combined risk. This can be obtained by combining standard metric values, for
example exploitability, remediation and report confidence, with attack graphs.
Much research goes into detection of vulnerabilities. To understand and ensure the security of
systems, one study [22] suggests a method for vulnerability detection. The study discusses how
vulnerabilities become a threat to both the users and providers of various systems. The authors
point to the growing number of vulnerabilities and call for tools against this. Their
contribution is a statistical tool for vulnerability detection: an algorithm is developed for
vulnerability detection in the PHP language. This is similar to our study in that our data also
comes from vulnerability detection, although ours is dynamic. The difference is that static
detection deals with vulnerabilities in code, while dynamic detection looks at running systems.
The study is a few years old but is still relevant, as the Symantec report studied this year also
points to an increase in vulnerabilities.
The paper [23] presents ZMap, a modular and open source network scanner. Its main function is to
perform Internet-wide scans, and it provides high performance compared to other tools like Nmap,
which is a common tool in networking. One of the researchers' analyses revealed that the tool
could scan the public IPv4 address space 1300 times faster than Nmap, among other advantages.
ZMap provides modular output handlers, which are a benefit if the user wants, for example, to add
the scan results directly to a database. When compared to the scanner OUTSCAN that Outpost24
uses, there are some notable differences. Where ZMap only looks at ports, OUTSCAN goes beyond
this functionality. It can interact with systems and analyze the behavior of the packets sent.
This allows it to detect more details about systems and find possible vulnerabilities.
Today the area has received more attention and is still a subject that is constantly evolving.
One example is the web applications that occur in our daily lives: as they grow, new
vulnerabilities with new risks could follow. The study [24] shows that it is possible to detect
vulnerabilities in PHP, a language frequently used in web applications, by creating a tool that
uses different kinds of analysis, such as data flow and literal analysis. This is similar to the
other paper [22]. However, this study implements a larger tool and focuses on web applications.
It is also a more recent and up-to-date study. It shows the need for such tools and new research.
The information obtained by the different tools or scans can be compiled and presented with the
help of graphs. In [25], graphs are used to illustrate which vulnerabilities and configuration
errors need patching efforts, since these are the ones that allow attackers the greatest amount
of access. The study uses a dataset from networks that is similar to ours. The purpose of
utilizing these graphs is the advantage that defenders can gain. The primary benefit is to find
bottlenecks and secure them; another is to present information graphically, for example as a list
of the most critical vulnerabilities. This helps to identify which areas require more effort
based on severity, as computer networks are often large and complex. Similarly, we also produce
some visualizations of data by using Python and its libraries. This study highlights a use for
such information and how our KPIs could be used in a practical scenario.
In brief, gathering vulnerabilities is an important first step, which can be done through the
development of various tools or with already existing scanners. Then there is the need to compile
all the information that is found to reflect the state of current findings. This can be done by
performing measurements of security, which provides an overview of the security landscape. It
shows which areas require more resources and which need more attention and prioritization, in
order to provide strong security to everyone.
The commercial report from WhiteHat [26] analyzes data that is scanned or remediated from
different applications used with WhiteHat services. Web applications are a common topic since
they are very popular today. The report presents interesting observations as security
measurements based on key indicators such as the likelihood of a given vulnerability class,
remediation rates, time-to-fix, and the age of an open vulnerability. These were presented with
the help of different groupings, such as the risk levels critical, high, medium and low. Other
groupings were vulnerability classes and industries. The report has a good structure that is
based on statistics and conclusions.
Acunetix [27] is a company that also analyzes vulnerabilities in web applications. The data
analyzed also comes from the company's own scanner. Their analysis shows improvements when
comparing two datasets from two different years. The vulnerabilities are distributed in
groupings by category, depending on severity. The severity level is highlighted as a key factor
for measuring security and is presented in plots.
2.3.2 Malware
Malware was also a common topic that appeared in almost all the reports. Malware is malicious
software and poses a large threat to organizations. For example, one report [28] featured this
topic as a significant threat for companies. Statistics regarding malware showed breakdowns of
samples per platform, such as Windows, Android and Apple iOS. A conclusion from the plots
presented showed that Windows had the highest rate of events. Other metrics related to malware
are also shown in the report. One example is a bar chart of newly discovered malware per
platform, and its yearly growth. The top malware samples per platform were also listed.
2.3.3 Angler
Angler, which is one of the most common exploit kits, is commonly featured in different reports.
One of them is the report from Cisco [29], which studied Angler when researching threats. Most of
the written information was backed up and visualized by charts; one illustration was a graphic
showing revenue. The exploit kit is used to exploit security holes to infect a user with
malware. Cisco studied this kit in depth. They tracked how often Angler's operators changed IPs
to avoid detection, together with which proxy servers they used. By contacting the internet
service provider, they could work towards shutting down the operators' IP addresses.
Trustwave has published a report [30] within the area of global security. The report contains a
section on threat intelligence and covers exploit trends and kits, among other things. The report
describes some exploits that have been performed with the help of Angler. Some trends show that
Java was the component most exploited through exploit kits. Another trend is that exploit kits
are now being offered “as a Service”, meaning that a provider hosts the kit on its own servers
for customers to use. Similarly, one trend is that Angler and other exploit kits started using
encryption. This showcases a move towards a more serious and matured market for hackers.
2.3.4 Ransomware
Ransomware is a topic often found in the security area. Ransomware is malware that encrypts data
on an infected system. Based on statistics from [31], it appears among the top 5 varieties within
crimeware. The report shows a bar chart in which ransomware ranks second, which demonstrates how
popular that kind of malware is.
The report from Symantec [32], which analyzes Internet security threats, mentions several types
of ransomware and visualizes them in plots. An example is the total amount of crypto-ransomware
for two different years, which shows an increasing percentage growth. Another observation
concerns Android ransomware, which intimidates users by displaying fake FBI warnings on their
lock screens that imitate Google's design. Symantec's research also looks into the future and
suggests that Smart TVs, among other devices, are potentially vulnerable to this kind of malware.
In conclusion, ransomware is a current and common topic that has emerged in the last few years.
Symantec also shows a timeline of ransomware discoveries from 2005 and eleven years forward,
which shows how crowded the field has become in recent years. Ransomware's presence is growing,
and it seems that most operating systems have been targeted by this kind of malware.
3 METHOD
This chapter will present the methodology of this work. A study into similar works was
performed first, to gain information about the subject and help plan the method. The work was
then based around this background study. One problem was that the other methodologies seen
for creating reports were often brief and basic descriptions. This was an issue with the
commercial papers. By combining these methodology descriptions with sketches of the desired
results, a relatively detailed method could be defined.
A background study was conducted in which existing similar reports were analyzed and
compiled. To ensure that the information was up to date, the scope of the background study was
limited to materials released after 2014. The goal of the background study was the identification
of KPIs. This study focused on similar reports that showcase information about the security
landscape. We looked for KPIs that could be related to security vulnerabilities.
Most similar reports are released as white papers from other companies. The work began with a
study into related scientific works. Some papers that dealt with measurements in security were
found, as presented in the Theoretical Frameworks-chapter. However, works with similar
objectives to this study were not found. Because of this, the study was directed towards
commercial papers. After searching for papers, we found a total of 22 reports from the last two
years that described vulnerabilities. A few of these reports were already known from earlier. An
article was also found that presented some vendors that release security reports [34]. Together
with snowballing within these reports, around half of the papers used were found. The other half
of the reports were found using search engines Bing and Google. The terms used are shown in
Table 3.1. The terms were constructed from common security keywords and by looking at the
naming scheme of the first reports found. Reports chosen were found within the top 50 results.
Search term
cyber security|threat report 2015|2016
security vulnerability report 2015|2016
network security report 2015|2016
vulnerability statistics security report 2015|2016
Table 3.1 Search terms used for finding the commercial white papers.
The found reports were annually released reports detailing the security landscape. Since the
contents would vary between reports, selections were made based on their usefulness. The
criteria to include a report were that they had sections detailing vulnerability trends or metrics.
Reports were selected if at least one KPI related to vulnerabilities was present. Other reports
were excluded from the study. Reports using datasets from other types of vulnerabilities were
still included, as these KPIs could most often also be used on our data. The most common type
of data was related to malware, web applications and software. The result of this selection
process was that two reports were disregarded, as they did not contain any KPIs. This left 20
reports for the study, as no other reports were found during the search.
After selection, the reports were examined for useful KPIs. Every KPI identified was extracted
and inserted into a matrix as shown in Figure 4.1. Using this matrix allowed for easy
visualization of the prevalence of KPIs. Every KPI related to vulnerabilities that was found in the
report was inserted into the matrix. Many reports used slightly different KPIs, so similar ones
were grouped together or generalized. This also involved considering whether they would be possible
to extract from the available data. This was done, for example, with reports dealing with different
datasets or timelines. The KPIs were then filtered based on prevalence, and a KPI had to have
occurred at least four times to be selected. The result was a collection of general KPIs in a
condensed matrix.
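The prevalence filter described above can be illustrated with a minimal sketch. The report names and KPI sets below are invented for the example, and the study's own matrix was kept in a spreadsheet, so this only demonstrates the counting rule (a KPI is kept if it occurs in at least four reports).

```python
from collections import Counter

# Hypothetical extraction result: report name -> set of KPIs found in it.
kpis_by_report = {
    "Report A": {"Severity", "Remediation time", "Vulnerable targets"},
    "Report B": {"Severity", "Exploit kit usage"},
    "Report C": {"Severity", "Remediation time"},
    "Report D": {"Severity", "Remediation time", "Vulnerable targets"},
    "Report E": {"Remediation time"},
}

MIN_OCCURRENCES = 4  # threshold stated in the text


def condense(kpis_by_report, min_occurrences=MIN_OCCURRENCES):
    """Count in how many reports each KPI occurs and keep the prevalent ones."""
    counts = Counter(kpi for kpis in kpis_by_report.values() for kpi in kpis)
    return {kpi: n for kpi, n in counts.items() if n >= min_occurrences}


selected = condense(kpis_by_report)
print(sorted(selected))
```

In this invented example only "Severity" and "Remediation time" reach the threshold; the other KPIs are dropped from the condensed matrix.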
When working with the KPIs we need to avoid absolutes. Such numbers could present problems
during the statistical analysis. The report should present data where comparisons are possible.
One example is that, when identifying trends, the absolute amounts of data could differ between
points in time. This is especially true since the dataset originates from Outpost24's scans, which
means that scan frequency among customers may skew the statistics. Instead, the focus will be on
relative amounts, and the KPIs will be converted accordingly. Using averages and percentages allows for comparisons between
different markets and points in time. All the reports used in the background study are noted in
Table 3.2.
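The reasoning for relative amounts can be shown with a small worked example. The counts are invented: the second month has more than three times as many findings simply because more scans were run, yet the share of high-severity findings is the same in both months.

```python
# Hypothetical monthly counts; a busier scanning month inflates absolute numbers.
monthly = {
    "2016-01": {"high": 30, "total": 300},
    "2016-02": {"high": 100, "total": 1000},
}


def high_share(monthly):
    """Percentage of findings rated high for each month."""
    return {month: 100 * c["high"] / c["total"] for month, c in monthly.items()}


shares = high_share(monthly)
print(shares)
```

Comparing the absolute counts (30 against 100) would suggest a sharp increase, while the percentages show no change.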
3.2.1 Tools
To perform the statistical analysis, we developed custom tools programmatically. The
language chosen was Python 3. Both authors had some experience with the language and
Outpost24 also had expertise in it. Using a language that is also used within the company allows
for them to improve and continue development in the future. Python also has access to various
plot-generating tools, such as matplotlib and plotly, to visualize data. There are also tools for
structuring and managing data, with versatile and high-level list functionality or data analysis
libraries like Python Data Analysis Library (pandas).
3.2.2 Dataset
Outpost24 has provided the dataset that this study is based on. The data consists of records from
Outpost24's network scanner OUTSCAN™. OUTSCAN™ is an automated security scanner that
scans external computer networks and identifies security vulnerabilities. Over time, Outpost24
has gathered a large amount of data from this program, consisting of many millions of entries.
Some of these entries are over ten years old. To set the scope of this study,
general analysis will use data up to ten years back (2006). More detailed analysis will only use
data up to five years back (2011). The aim is to focus the study towards recent years while still
allowing comparisons to past situations.
OUTSCAN™ can identify vulnerabilities and then compile useful information about them, by using
both public databases and internal expertise. Each identified vulnerability is accompanied by this
information. The vulnerability information used in this study is presented in Table 3.3.
Datafield
Target Product
Target Platform
Time and Date
Age
CVE-ID
CVSS
Table 3.3 Vulnerability information fields describing the available data.
Apart from these specific fields, the data mostly consisted of counts grouped by time. To make
it possible to generate reports for specific time periods, data was extracted for every month. For
each month, entries were then grouped by indexes, such as CVE-ID or Target Product, to
retrieve the total number of findings for that index and period. This meant that, for each month,
the data incorporated the indexes and their information.
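A minimal sketch of this monthly grouping is given below. The actual extraction was done by Outpost24 with SQL queries that are not part of this text, so the input format (ISO dates paired with an index value) and the placeholder CVE identifiers are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical raw findings as (ISO date, index value) pairs.
# The CVE-IDs are placeholders, not real identifiers.
findings = [
    ("2016-01-03", "CVE-0000-0001"),
    ("2016-01-17", "CVE-0000-0001"),
    ("2016-01-20", "CVE-0000-0002"),
    ("2016-02-02", "CVE-0000-0001"),
]


def monthly_counts(findings):
    """Count findings per (year-month, index) pair."""
    return Counter((date[:7], index) for date, index in findings)


counts = monthly_counts(findings)
print(counts[("2016-01", "CVE-0000-0001")])
```

Because each count is keyed by month, yearly or quarterly totals can later be obtained by summing the relevant months.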
After the initial study of KPIs, discussions with Outpost24 personnel were held to extract the
data. The KPI matrix was deconstructed into more detailed queries that would be possible to
extract. Outpost24 received the desired KPIs and metrics. Discussions took place about which of
these would be possible to retrieve, but also which would lead to relevant data. It was also
decided that the data would be grouped by months, so that it would be possible to look at time
periods. By grouping into months, analysis could be performed either each year or quarter.
After having a meeting, Outpost24 provided the data required. This part of the work was
performed by the company. During the extraction, the data was also sanitized from any sensitive
information and anonymized. Since the data originates from customers of Outpost24 and real
organizations, it is important to keep the data anonymized. Any details that could be connected
to organizations were filtered out and extracted data only consisted of what was necessary for the
analysis.
As the original data was stored in an SQL database, the data we received was formatted
accordingly. This meant that the relevant data was presented as tables, with rows and columns. The
columns presented the type of data, such as the type of vulnerability and the amount found. The
last column was the date column. For each row, this column showed the period from which the row
was extracted. These tables were often over six thousand lines long, depending on the
number of entries of a specific data type per month.
3.2.4 Data Processing
To conduct the analysis, the data had to be processed correctly. The goal of this part was to
properly arrange and present the data in a way that was possible to analyze. The process was
based on the KPIs, by working towards structuring the data according to the KPIs. The KPIs
provided a view of the desired results and thereby guided the processing. Since the data was in
the form of tables, with varying sizes from 6000 to 40000 entries, programming was necessary to
properly analyze this. The data was provided in the form of files with Comma-separated Values
(CSV). A summary of what the tables contained and how they were structured can be seen in
Table 3.4.
Table                Description
CVE                  Vulnerabilities as CVEs together with their name and CVSS.
Vulnerabilities      Number of vulnerabilities for different time periods, also how many
                     were remediated.
Products/Platforms   Number of vulnerabilities in platforms and products, together with
                     severity level.
Age and times        The age of vulnerabilities and the time to fix.
Table 3.4 The tables containing the data that were available and their contents.
The analysis of the data was done programmatically, using the CSV files as input. The common
Python library pandas (Python Data Analysis Library) was used as the base of the analysis. This
library offers functionality to handle data structured as tables. It also has default support for
common operations such as sorting, selecting and altering columns or rows. Another key feature
was the possibility to apply custom functions over the whole data structure.
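The pandas operations mentioned above can be sketched as follows. The column names, placeholder CVE identifiers and values are assumptions, since the real CSV layout is not reproduced here. The severity thresholds follow the CVSS v2 qualitative ranges, which may differ from the mapping used in the dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content; real column names and values are not given in the text.
CSV_TEXT = """cve_id,cvss,findings
CVE-0000-0001,4.3,120
CVE-0000-0002,7.5,95
CVE-0000-0003,2.6,40
"""


def severity_label(cvss):
    """Illustrative custom function; thresholds assume the CVSS v2 ranges."""
    if cvss >= 7.0:
        return "high"
    if cvss >= 4.0:
        return "medium"
    return "low"


table = pd.read_csv(io.StringIO(CSV_TEXT))
table["severity"] = table["cvss"].apply(severity_label)       # custom function per value
top = table.sort_values("findings", ascending=False).head(2)  # sorting and selecting
print(top[["cve_id", "severity"]])
```

Reading from a string keeps the sketch self-contained; with real data the same call would take a file path.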
The program was designed to be modular, so that it could easily receive new additions or
changes. First, functions were built for basic functionality. This involved parsing the dates and
tools to manage time periods, which could be used later. When working with averages, the
default types for time periods did not support the large numbers that could occur.
Therefore, custom functions for determining, adding and comparing dates had to be created.
Then, the vulnerability category had to be extracted. The main thing missing from the dataset
was the commonly used vulnerability categories. These form a hierarchical structure that can be
presented in different ways, depending on viewpoint. For this study, the broadest CWE hierarchy
that incorporated all the present CVEs was constructed. A function was built using a tool called
“cve-search” [50] that could extract information for each CVE. A program was then created to
map the different CWEs into a tree structure that would display their hierarchy.
The view Development Concepts [51] was chosen for this study. It splits CWEs into a
hierarchy that connects them in ways commonly seen during software development. The CWEs
were also extracted and formatted through Excel, since the hierarchical nature would otherwise
be difficult to display. Only CWEs that represented over 1% of the total amount were used.
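The two steps described above, building a tree of categories and keeping only those above 1%, can be sketched as below. The identifiers and relations are placeholders: real parent and child links would come from the chosen CWE view, and the lookup through cve-search is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical child -> parent relations and one CWE label per observed CVE.
PARENT = {"CWE-B": "CWE-A", "CWE-C": "CWE-A", "CWE-D": "CWE-B"}
cwe_of_cve = ["CWE-D"] * 120 + ["CWE-C"] * 79 + ["CWE-E"] * 1


def build_tree(parent):
    """Invert child -> parent links into parent -> sorted list of children."""
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)
    return {par: sorted(kids) for par, kids in children.items()}


def common_cwes(assignments, threshold=0.01):
    """Keep CWEs whose share of all assignments is over the threshold."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {cwe: n / total for cwe, n in counts.items() if n / total > threshold}


tree = build_tree(PARENT)
shares = common_cwes(cwe_of_cve)
print(tree)
print(shares)
```

Here "CWE-E" accounts for 0.5% of the assignments and is therefore left out, while the tree keeps the nesting needed for a hierarchical plot.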
Functions were then built separately for each KPI to extract, process and present the data. The
processing involved identifying the desired entities, structuring them and returning them in the
form of a data structure. Data with a low sample size was disregarded from the study. To present
accurate interpretations, the ten or 20 most commonly occurring entries were extracted and then
studied. Targets or vulnerabilities that could not be identified were excluded to avoid uncertain
conclusions. A common starting point for each analysis function was to extract the data
for the correct period, sort it and display the top values. Depending on the KPI, these
numbers were then normalized by calculating the averages. By avoiding absolute values, the aim
was to allow for comparisons independent of, for example, scan frequency or organization size.
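The common pattern of selecting a period, normalizing and keeping the top values might look like the sketch below. The row layout and product names are assumptions, and dividing by the number of scanned targets is only one possible way of averaging.

```python
# Hypothetical rows: (year, product, findings, scanned targets running the product).
rows = [
    (2016, "Product A", 500, 100),
    (2016, "Product B", 90, 10),
    (2016, "Product C", 40, 40),
    (2015, "Product A", 999, 10),
]


def top_products(rows, year, n=10):
    """Select one period, normalize to findings per target and return the top n."""
    averages = {
        product: findings / targets
        for y, product, findings, targets in rows
        if y == year and targets > 0
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]


ranking = top_products(rows, 2016, n=2)
print(ranking)
```

Note that "Product A" has the most findings in absolute terms but ranks below "Product B" once the numbers are expressed per target.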
The next step was visualization. These functions were built separately from the extraction and
analysis parts, and took the result of the previous functions as input. By building this design,
future works could still utilize base functionality to retrieve the data and then present or further
analyze it. The visualization functions were more specific, since the different KPIs had different
possibilities for visualization and were often formatted differently after analysis. The plots were
also annotated to more accurately display the results.
The key parts of the process definition were experimentation and documentation. As the work
continued towards the goal of producing a report, every step was documented. Throughout the
work period, four main documents were used to track and document what had been done. The
idea was that these would later be used both to compile the report and the process definition. The
notes were structured into the documents presented in Table 3.5.
Report summaries   As the background study into commercial reports was conducted,
                   the contents and structure were also documented and summarized
                   along with the KPIs.
KPIs               Information regarding the KPIs was put into a spreadsheet. This
                   allowed for easy tracking of KPIs, where they were found, how
                   often they occurred and what they meant. This is better described
                   under Identification of Key Performance Indicators. Each KPI
                   was also accompanied by a definition, to describe their function
                   and motivate their inclusion.
Report sketch      This document was built around our own report, and ideas for
                   contents. This contained summaries for common practices in
                   other reports such as sections and additional content.
Tasks completed    As work went on, different tasks completed were of course
                   documented. This would be used both to describe this method but
                   also the process definition. A simple list of goals was used for
                   this.
Table 3.5 Documents used to structure information and compile the process definition.
The report summaries were constructed first. These were then used to construct a sketch of our
report and provide ideas for what type of content to include. Summarizing large reports required
time and effort, since not all data was relevant. Filtering was therefore applied with regard to
the scope of this study to solely include relevant data. Otherwise, the amount of
information would sometimes have exceeded our scope.
The report sketch was one of the main documents created during this process. It was used to
gather ideas and guidelines observed in relevant literature. This is what created the base of the
report. By setting a goal, parts could then easily be added or excluded as work progressed
and the dataset became more familiar. Together with the KPIs, this document provided a
goal for the resulting report structure. When the necessary components had been identified,
work could be focused on specific parts. After the sketch was created, it was expanded into a
brief plan that better specified what to do and in what order.
Each step was often discussed to best know how to proceed. This could involve coming up with
two solutions to a problem and then selecting the best one. This often occurred over the whole
course of the project, especially relating to choices when performing the background study or
when developing the program. The results were compiled into the tasks completed document.
This acted as a natural way to track the overall process, but would also help in defining the
process itself later.
As described above, lots of notes were written down regarding the process for use later. Some of
these notes also involved suggestions for contents of the report. When combined, these led to a
rough sketch on how the report could be formatted. For example, some headlines were written
down together with things that could be incorporated into the report. These originated from the
background study of commercial reports, and the most commonly seen headers were chosen. Since
the study would be based around KPIs and statistical analysis, these formed the key result of our
report. Sections that would not fit into this subject were filtered out. The desired output in the
form of KPIs was already known. This provided a goal to work with and a target of what to
include in the result part of the report. The different sections of the report together with a
summary of the notes can be seen in Table 3.6.
After the program and the planned outline of the report were created, the program was run to
generate the plots as pictures. These pictures were then put into the report and the rest was
built around them. By focusing on the KPIs and plots, the report would be based on facts and
remain relatively simple. Each of the plots received a category and an explanation. Where
necessary, such as when a CVE does not offer a descriptive name, the top results were given a brief
explanation. This would help readers understand what the actual effect would be. Where
possible, some obvious conclusions were drawn. The report was then improved by working with
representatives from Outpost24, to meet their expectations. This also involved adding some
design and descriptive texts.
Section               Description
Preface               A simple preface presenting the company, report name and
                      year/quarter.
Introduction          An introduction to the report, the company and data source.
Method                A very brief description of the method, dataset and other
                      considerations.
Table of contents     Can be included if the report is long enough.
Result                The main part of the report. This will be structured into minor
                      sections for each KPI. Some similar KPIs may be grouped
                      together to make the text less complex.
Result – KPIs         These sections will contain visualized plots together with
                      a basic explanation. Where needed, further
                      descriptions will be added to help explain the results. One
                      example would be to describe the effect of the most common
                      vulnerability.
Final words           Some conclusion for the report and the situation.
About the report      Some additional information about the report together with
                      the purpose, authors and company.
Contact information   Common information about the organization.
Additional content    If time is available, additional content may be included.
                      This could be some minor cases or news articles about major
                      events in the security landscape. These would preferably be
                      related to the dataset of the report (networks) or the company
                      (Outpost24).
Table 3.6 Planned sections of the report.
It was important that the plots remained easy to understand and were explained. Some of
the studied literature required some thought to understand. The goal was to avoid this
where possible and to create a simple but informative report. A description of the
method was also presented. Even if it is not of interest to the typical reader, its absence was
one of the problems seen in the background study. If the method of a report is not explained, it
is difficult to trust the results. The method description adds legitimacy to the report and the
results it conveys.
As the report was created using both the defined process and the developed tools, this allowed
for a test of both. This acted as a final test of the proposed process. As the KPIs were extracted
and put into a report, this could be compared both to other similar works and to the report sketch
created previously. Both comparisons allowed for a quick evaluation of the work performed and
indicated that the method was successful. Asking for feedback from Outpost24 was the last step in
this validation process. The company was happy with the results.
3.5 Validation
During the work, techniques for validation were considered. These were divided into internal and
external validation.
Tests were conducted during and after the process definition. These revolved around using the
method together with the tools to create reports that presented the correct KPIs. This was done
because of the lack of similar works to which the process could be compared. The dataset
from Outpost24 is also specific, and therefore might require a specifically created process.
4 RESULTS
The results are presented below. They are based on the background study performed, which
was later paired with the dataset from Outpost24 to present currently relevant findings.
The results from reading the commercial reports during the background study were compiled into a
matrix, Figure 4.1. The matrix contains the KPIs that were found for each respective report. The
matrix allowed for storing and counting the KPIs found while reading. The outcome
was a delineation of reports that had the same or similar KPIs, but also those that had
unique ones. The KPIs that were chosen are those that are represented in the matrix at least four
times. The sum column and row denote the total number of occurrences.
Figure 4.1 Matrix of the chosen KPIs and the reports in which they were found.
After the matrix was created, a final row was inserted for easy comparison to our report. These
are the KPIs that were used in our study. Our report also included a suggestion for a new KPI
that was not seen in the background study: the average CVSS score of vulnerabilities found.
The selected KPIs are arranged to best suit the data we are working with. Thanks to the
selected KPIs, the first research question could be answered. The indicators listed
below are relevant for measuring the level of security in organizations. This is also supported
by their use when measuring the level of security with the Outpost24 dataset.
4.1.1 Severity
This KPI yields the average number of vulnerabilities, split per severity level. The levels
consist of low, medium and high. The measurement shows which severity level is most common
and therefore warrants more attention. KPIs could be used to oversee
the severity of vulnerabilities and prioritize the remediation of higher ones. Severities could also
be used together with other KPIs to set different goals based on how severe a vulnerability is.
4.1.2 Vendor/Platform/Product
This KPI gives an overview of the average number of vulnerabilities per specific platform, vendor
or product. This will help demonstrate which programs might be more vulnerable. By finding the
edge cases, organizations can watch those prone to vulnerabilities more closely. KPIs could be
used to measure the number of vulnerable programs and attempt to keep them below certain
numbers.
4.1.5 Remediation
The difference from the above metric is that this looks at the timeline of vulnerabilities,
mainly the average number of days before a vulnerability has been remediated. This could also
be combined with different severities, platforms or products to see how remediation times might
differ between these. This is important for organizations since it shows for how long
systems might be vulnerable to attacks. A KPI could be constructed to keep the number of days
as low as possible.
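As a worked example, the average number of days to remediation per severity could be computed as below. The dates and the tuple layout are invented for illustration. Durations are handled as whole days, which is an assumption about a reasonable resolution rather than a description of the tool built for this study.

```python
from datetime import date
from statistics import mean

# Hypothetical findings: (severity, first seen, remediated on).
findings = [
    ("high", date(2016, 1, 1), date(2016, 1, 21)),
    ("high", date(2016, 2, 1), date(2016, 2, 11)),
    ("low", date(2016, 1, 1), date(2016, 3, 1)),
]


def mean_days_to_fix(findings):
    """Average number of days between detection and remediation per severity."""
    days_by_severity = {}
    for severity, seen, fixed in findings:
        days_by_severity.setdefault(severity, []).append((fixed - seen).days)
    return {severity: mean(days) for severity, days in days_by_severity.items()}


averages = mean_days_to_fix(findings)
print(averages)
```

The same grouping key could be swapped for platform or product to compare remediation times between those instead.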
4.1.7 Total Vulnerability Growth
The total number of vulnerabilities will show the growth of vulnerabilities over time. A KPI
could be created to avoid this number rising. While providing similar functionality as the number
of vulnerable targets, the total growth of vulnerabilities can provide a more detailed look into
how severe the situation is. For a target to be vulnerable, only a single vulnerability would need
to be present.
4.1.8 CVSS
To more accurately measure the growth of vulnerabilities, our report adds a CVSS time series.
This will show the overall trend of vulnerability severity. It could also be used by organizations
as a KPI to measure how the threat level changes. Depending on how high this number becomes,
there might be a need for more resources towards increasing security.
The first thing would be to examine the KPIs and the matrix. At a suggested interval of every
second or third year, a quicker background study would be redone. This is to ensure staying up to
date and having relevant, comparable KPIs.
The KPIs should then be examined against the dataset. This is to ensure that the KPIs are still
relevant with regard to the available data, but also to see if new metrics could be added.
Mapping the KPIs onto the dataset was an important task, since there might be multiple ways to
interpret and present the results.
Data would then be extracted in the same way as for the original study. Outpost24 managed
this task and created queries that were stored internally to allow for the possibility of repeating
the work. The data has to be structured in a specific way for it to work with the program: the
tables and columns should match those that the program attempts to parse.
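One way to guard against mismatched input is to compare each CSV header with the columns the program expects before any analysis is run. The sketch below is an assumption about how such a check could look; the table and column names are hypothetical, as the real layout is not listed in this text.

```python
import csv
import io

# Hypothetical expected layout per table.
EXPECTED_COLUMNS = {
    "cve": ["cve_id", "name", "cvss", "date"],
    "age_and_times": ["age_days", "days_to_fix", "date"],
}


def missing_columns(table_name, csv_text):
    """Return the expected columns that are absent from the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [col for col in EXPECTED_COLUMNS[table_name] if col not in header]


sample = "cve_id,name,date\nCVE-0000-0001,Example,2016-01\n"
print(missing_columns("cve", sample))
```

An empty list means the table can be parsed; anything else points to the column that needs to be added to the extraction query.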
The program could then be run on the data in the capacity required. The manual input would be
to select the desired year to examine. Another choice would be if the data should be presented as
plots, simple outputs or written into a CSV-file.
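The manual inputs described above could be exposed as command-line options. The following is a hypothetical interface sketched with the standard argparse module; the option names are assumptions and do not reflect the actual program.

```python
import argparse


def build_parser():
    """Command-line options for the year to examine and the output form."""
    parser = argparse.ArgumentParser(description="Generate KPI statistics for one year.")
    parser.add_argument("--year", type=int, required=True, help="year to examine")
    parser.add_argument(
        "--output",
        choices=["plot", "text", "csv"],
        default="text",
        help="present results as plots, simple output or a CSV file",
    )
    return parser


# Parsing an explicit argument list keeps the example independent of sys.argv.
args = build_parser().parse_args(["--year", "2016", "--output", "csv"])
print(args.year, args.output)
```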
Finally, the report itself would be created. Together with the first report as a guideline, other
suggested contents of the report are also attached to the process definition. A template in .doc
format was also created to further ease the report creation.
4.3 Observed Trends in Vulnerabilities
To conclude this work, a suggestion for a report was created. This would provide Outpost24
with a basic report structure and content, but also add analysis to this study. The results of
the statistical analysis are therefore presented below, together with explanations of each plot.
These are the statistics and plots that were present in the report created for Outpost24. They are
grouped by the KPIs that they relate to. The data is from 2016 unless stated otherwise. The time
series plots contain data from the past five or ten years. The numbers are normalized to allow for
comparisons without interference from varying amounts of data.
4.3.1 Severity
This KPI, in Figure 4.2, shows the average severity level of vulnerabilities. This means that out
of all vulnerabilities seen in 2016, over 75% were rated medium and 16% were rated high.
Figure 4.3 focuses on the average severity over time. Looking at the proportions over time, we
can see changes in severity trends. As of the last two years, low severities are decreasing in
relation to high and medium ones. The data is extracted for each month back to the end of 2011.
[Figure 4.3: plot not reproduced. Axes: Proportion (%), Year-month.]
4.3.2 Platform
Platforms denote the operating system run on a target. In this section, the average
number of vulnerabilities on these platforms is examined. The ten most frequently seen platforms
were selected for this analysis. Figure 4.4 displays the average number of vulnerabilities for
these platforms. The X-axis describes the proportion of targets on each platform that had
vulnerabilities found in total. The bars are also separated per severity level proportions. The
result is that most vulnerabilities are seen in Microsoft Windows Server, where 34% of the targets
contained vulnerabilities. The same platform was also the most affected by high severity
vulnerabilities. Red Hat Enterprise Linux followed with 23%.
[Figure 4.4: plot not reproduced. Axis: Proportion (%).]
Figure 4.5 adds a comparison to the previous plot by showing how common each of the top
platforms was relative to the others. The two plots are mostly reversed,
with the most vulnerable platforms being less common in actual usage. This might also give some
insight into the attack surface. Generic Unix is seen to be the third most common platform in both plots.
[Figure 4.5: plot not reproduced. Axis: Proportion (%).]
4.3.3 Vulnerable Targets
Figure 4.6 further shows the attack surface of targets. On average, close to a third of all the
targets present were found to contain vulnerabilities.
The same statistic is presented in Figure 4.7, with additional data from 2011 to the present day.
This shows how the attack surface changes over time.
[Figure 4.7: plot not reproduced. Axes: Proportion (%), Year-month.]
4.3.4 Remediation Percentage
Figure 4.8 describes the percentage of risks remediated per month. This shows that the
percentage of fixed vulnerabilities has remained rather stable over time, staying around 20 to
30%. However, there are noticeable spikes for some periods where a large number of
vulnerabilities was fixed at once. This plot is not dependent on the number of vulnerabilities, but
rather on the proportion remediated.
[Figure 4.8: plot not reproduced. Axes: Proportion (%), Year-month.]
4.3.5 Remediation
Figure 4.9 displays the average time to fix for each severity, and the combined average. It is
notable that the High vulnerabilities are fixed within one month, while the others take upwards
of two.
[Figure 4.9: plot not reproduced. Axis: Days.]
Figure 4.10 presents the top 20 most common products and how common they are. It shows
that vulnerabilities in Microsoft/Windows are remediated quickly, on average in over two
weeks. These results are independent of severity. The five fastest and the five slowest
remediation times among these top 20 products are presented.
[Figure 4.10: plot not reproduced. Axis: Proportion (%).]
Figure 4.11 shows the average time of remediation over time. This will show the general trend of
remediation times in the form of days, independent of severity.
[Figure 4.11: plot not reproduced. Axes: Days, Year-Month.]
4.3.6 Vulnerability Category
The categories are presented as CWEs in Figure 4.12. To properly relay this hierarchical
structure, a sunburst plot was created. This is because a category can be a member of another
category. Only the CWEs that occur more often than 1% have been chosen for this visualization.
The most common CWEs and what they mean are explained separately.
Figure 4.12 Vulnerability categories for the observed CVEs, based on percentage.
CWE-20 Improper Input Validation
Improper input validation covers problems in any system that receives data from an external
source. Its likelihood of exploit is rated as high. Subcategories involve different types of
injections and cross-site scripting.
4.3.7 Total Vulnerability Growth
Figure 4.13 displays the relative growth of vulnerabilities over time. By setting a baseline during
the current year (2016), the average amount of vulnerabilities for different periods can be
compared.
[Plot not reproduced. Axes: Relative Proportion, Year-Month.]
Figure 4.13 Relative vulnerability amount over time in comparison to the 2016 baseline.
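The baseline comparison can be illustrated with a short worked example. The yearly averages are invented; only the idea of dividing each period by the 2016 value is taken from the text.

```python
# Hypothetical average number of vulnerabilities per target for each year.
yearly_average = {2014: 2.0, 2015: 3.0, 2016: 4.0}


def relative_to_baseline(values, baseline_year=2016):
    """Express each period as a fraction of the baseline year's value."""
    baseline = values[baseline_year]
    return {year: value / baseline for year, value in values.items()}


relative = relative_to_baseline(yearly_average)
print(relative)
```

A value of 0.5 for 2014 would mean that the average amount that year was half of the 2016 baseline, so values below 1.0 in earlier years indicate growth.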
4.3.8 CVSS
The average CVSS over time is displayed in Figure 4.14. This shows how the average CVSS of
identified CVEs changes over time.
[Figure 4.14: plot not reproduced. Axes: CVSS Score, Year.]
4.3.9 CVEs
A total of 6839 different CVEs were present in systems during 2016. The most common CVEs
overall are listed in Figure 4.15. The two most common CVEs are briefly described below.
While CVE numbers do not necessarily count as a KPI,
they are still an interesting statistic regarding vulnerabilities. Organizations can use this to
protect themselves against common attack vectors.
CVE-2013-2566
A medium-severity vulnerability, pinpointing the weakness of the RC4 cipher used in some
TLS/SSL configurations.
CVE-2014-3566
A medium-severity vulnerability. Related to a flaw in SSL 3.0, this is also known as the
POODLE vulnerability.
[Plot not reproduced. Axis: Proportion (%).]
Figure 4.15 The top ten of all CVEs found.
The five most common high-severity vulnerabilities are presented in Table 4.1 below. These are not
common enough to appear in the top ten. Table 4.2 shows the five most common vulnerabilities that
are over ten years old. Four of these appear among the ten most common CVEs.
5 DISCUSSION
In this chapter, the results will be analyzed and discussed. The work will be examined based on
the original objectives. Time will also be spent towards discussing factors that could impact the
results.
Effort was put into making the work adhere to ethical aspects. While working with data from
customers, all data handled had to be anonymous to respect the privacy of organizations. By
working with normalized averages, the results accurately represent the situation independently of
the number of scans or data. Work was also performed impartially together with Outpost24, and
the methodology used to produce the results is presented in detail. This adds transparency to the
study, whereas many related studies did not go into detail about methodology.
Working with vulnerabilities also aims to help spread information to organizations. This could
make more organizations aware of vulnerabilities and allow for increased security. However, a
malicious user could also use the information to devise an attack vector. While this is a risk,
the benefit of organizations being aware of vulnerabilities is important. Ultimately, such
information could be used to increase the security and reliability of public systems. As the
systems and vulnerabilities handled face public networks, it is important to notify
organizations and help maintain security. Faults could negatively affect societal functions. The
results of this study therefore aim to support sustainable development. The work itself and the
process defined were also kept general to fit this aim. The results were intended to allow for
easy report generation in the future. Keeping the process and KPIs general also allows for
flexibility and simple changes in the future.
As the work was performed in cooperation with Outpost24, their feedback was also requested.
After the work was finalized and presented to the company, it received positive feedback. As
Outpost24 was satisfied with the results, this feedback was added to the validation of the work
performed and the product delivered.
The KPI Matrix created from the background study provides an overview of the most commonly
used KPIs in the industry. This matrix presents a simple visualization of the results from the
study. The matrix originates from reports published by well-known companies within the area.
By presenting metrics as KPIs, this data can help readers assess their own security situation.
As there are multiple KPIs, several could be combined to create a baseline. Future comparisons
against this baseline could show the trend of the security level for organizations.
Due to the lack of academically published papers on the subject, the report adds valuable
knowledge to the area of vulnerabilities and vulnerability metrics. This was also an area where
there were calls for more research, as described in chapter 2. The report thereby allows for
future comparisons between our work and future studies on the subject.
As seen in the matrix, the most commonly used KPIs were related to vulnerability categories and
the overall number of vulnerabilities. These are also very general and could be extracted from
varying datasets. In our report, seven of the most common KPIs are used. The report therefore
compiles and focuses on the most common KPIs out of all the studied reports. As a result, the
report hopefully appeals to a larger audience, even though some minor, detailed KPIs may be
missing. Such KPIs depend heavily on the dataset available and are therefore difficult to
take from another report.
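The counting behind such a matrix can be illustrated with a minimal sketch. The report names and KPI labels below are placeholders chosen for illustration and are not taken from the actual KPI Matrix; the sketch only shows how the number of reports using each KPI could be tallied.

```python
from collections import Counter

# Hypothetical matrix: each studied report mapped to the KPIs it presents.
# Report names and KPI labels are placeholders, not the study's real data.
kpi_matrix = {
    "Report A": {"severity", "category", "remediation time"},
    "Report B": {"category", "total vulnerabilities"},
    "Report C": {"category", "severity", "region"},
}


def kpi_frequency(matrix):
    """Count in how many reports each KPI appears."""
    counts = Counter()
    for kpis in matrix.values():
        counts.update(kpis)
    return counts


frequency = kpi_frequency(kpi_matrix)
for kpi, reports in frequency.most_common():
    print(f"{kpi:<24}{reports}")
```

Sorting by frequency in this way gives a simple basis for selecting the KPIs that most reports share.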
As the KPIs build on state-of-the-art research from well-known companies in the area, we believe
this provides an accurate view of the most commonly used KPIs. No reports found during the
search were excluded from the study, since excluding reports based on selection criteria could
skew the results. The only reports not used were those that did not feature any KPIs.
Literature regarding vulnerabilities in network devices was uncommon, and therefore the KPIs
had to be generalized slightly. Most reports had some specific KPIs that are not present in our
report. These might describe the security situation in more detail. Therefore, our report may
lack some of the more detailed knowledge and statistics.
The program developed in Python automates the analysis. Together with the defined structure for
both input and output, it provides a simpler way to repeat the task of generating the report.
This offloads the main workload of future work while at the same time helping define the process
as a necessary step. Since the code is written in Python, it can easily be developed further and
run again without much delay. Because the program also offers the choice to create plots, there
may be other ways to use it. This allows for a flexible process where some parts can be built on
or changed if there is a need.
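To illustrate how such a flexible structure could look, the sketch below registers each KPI as a separate function and runs all of them over the same input. It is not the program developed in this study: the CSV columns (cve, severity), the KPI names and the registry are assumptions made for the example, and plotting is left out.

```python
import csv
import io
import json

# Registry of KPI functions; adding a new KPI means adding one function.
KPIS = {}


def kpi(name):
    """Decorator that registers a KPI function under a report key."""
    def register(func):
        KPIS[name] = func
        return func
    return register


@kpi("total_findings")
def total_findings(rows):
    return len(rows)


@kpi("unique_cves")
def unique_cves(rows):
    return len({row["cve"] for row in rows})


def generate_report(handle):
    """Read findings from a CSV file object and evaluate every registered KPI."""
    rows = list(csv.DictReader(handle))
    return {name: func(rows) for name, func in KPIS.items()}


# Hypothetical input with an assumed column layout.
sample = io.StringIO(
    "cve,severity\n"
    "CVE-2013-2566,medium\n"
    "CVE-2014-3566,medium\n"
    "CVE-2013-2566,medium\n"
)
report = generate_report(sample)
print(json.dumps(report, indent=2))
```

With a fixed input format and a fixed output structure, rerunning the analysis on a new export only requires pointing the function at a different file.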
The guidelines describe the work performed in detail. Because the work and research have already
been performed once, the remaining workload is simply to follow the steps again. The guidelines
include suggestions for improvements or changes that may have been out of scope for this study.
This allows for easy improvements or additions by future analysts if desired. By keeping the
process relatively brief and simple, while still providing some suggestions for improvements,
the process will be beneficial in more situations.
To conclude the study, a first report was generated using the tools and guidelines defined
during the process. By defining the desired results and KPIs at the beginning of the work, a
clear goal was set. As the first report matched these expectations, we can conclude that the
process created was successful. Using Outpost24's dataset, we could create a first report that
matched the expectations of the authors and Outpost24.
5.3 Observed Trends in Vulnerabilities
Part of the process included creating a first version of the report. This task was completed
using Outpost24's dataset from 2016. The identified trends are presented in the results,
visualized using graphical plots. Below follow some comments on which trends were observed and
what could be found. To keep the analysis general and avoid too many uncertainties, plots are
examined for general trends. By excluding uncertain or unknown data we focus on accurate
statistics. As the work was carried out together with a global company such as Outpost24, two
parties were involved in making this work successful. The data is also based on Outpost24's own
dataset, which is familiar to the company. Presenting accurate statistics was important to both
parties. By avoiding absolute numbers and working with normalized averages, numbers can be
presented accurately. This was done to prevent the data from being skewed by the number of hosts
or scans performed.
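One possible reading of a normalized average is sketched below: each scan is first converted into shares per severity, and the shares are then averaged so that a large scan does not outweigh a small one. The exact normalization used in this study may differ; the data layout and the function name are assumptions made for the example.

```python
from collections import defaultdict


def normalized_average(findings_per_scan):
    """Average each severity's share per scan, so every scan weighs the same."""
    totals = defaultdict(float)
    counted = 0
    for counts in findings_per_scan:
        scan_total = sum(counts.values())
        if scan_total == 0:
            continue  # a scan without findings carries no distribution
        counted += 1
        for severity, n in counts.items():
            totals[severity] += n / scan_total
    if counted == 0:
        return {}
    return {severity: share / counted for severity, share in totals.items()}


# Hypothetical scans: one small, one large.
scans = [
    {"low": 1, "medium": 6, "high": 3},
    {"low": 100, "medium": 800, "high": 100},
]
shares = normalized_average(scans)
print(shares)
```

Summing absolute counts would let the large scan dominate the result, whereas averaging the per-scan shares gives both scans equal influence.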
Many of the trends seen paint a negative picture of the evolution of vulnerabilities. More
vulnerabilities are found and they are becoming more serious. This calls for increased resources
to be spent on security. One important thing to note is that most of these vulnerabilities are
exposed to the internet, as the dataset deals with networks. This makes the situation more dire,
since an attack could come from anywhere. In the study of commercial reports, the same theme of
an increasing number of vulnerabilities was often found.
It is difficult to properly represent the whole situation for vulnerabilities. Data can vary
depending on multiple factors. The dataset in total includes over 43 million instances of
identified vulnerabilities, but these may still not accurately reflect the situation for
everyone. The data originates mainly from Europe and relates to vulnerabilities in
network-connected devices. This scope proved specific enough to constitute a relatively new area
of research. In the future, as more reports are published, further comparisons could be
performed.
One factor is that organizations connected to Outpost24 may be investing in their security. The
scanned systems could therefore be more secure than others. However, conducting studies on
organizations with no interest in being scanned may be resource intensive or even unethical.
Another factor could be that large companies focus more on security and therefore look more
secure. These large companies may also skew the statistics if they have many devices.
Normalizing by the number of devices per company requires more advanced studies and discussion
of how this normalization should be done. It would also mean less privacy for the organizations.
In this study, fully anonymous data was used, which leads to a general overview of the security
situation.
5.3.1 Severity
The different severities showed an uneven distribution. Medium made up most vulnerabilities,
with High being second. Their combined percentage was above 92%. This meant that most
vulnerabilities identified in the dataset deserve serious consideration. Security
vulnerabilities need to be combated. Even a medium-severity vulnerability could still be
damaging to a company. Looking at the trends over time, we see that low-severity vulnerabilities
have been decreasing lately, while medium-severity ones are increasing. High has been at a
steady percentage for the last few years and might stay so.
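As an illustration of how a severity distribution could be derived, the sketch below maps CVSS v2 base scores to severities and computes the percentage of each. It assumes the commonly used CVSS v2 ranges (Low 0.0 to 3.9, Medium 4.0 to 6.9, High 7.0 to 10.0); whether the dataset in this study uses exactly these thresholds is an assumption, and the scores are invented sample values.

```python
from collections import Counter


def severity_v2(score):
    """Map a CVSS v2 base score to a severity label (assumed ranges)."""
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    return "low"


def severity_shares(scores):
    """Return the percentage of findings per severity."""
    counts = Counter(severity_v2(score) for score in scores)
    total = sum(counts.values())
    if total == 0:
        return {"low": 0.0, "medium": 0.0, "high": 0.0}
    return {sev: 100.0 * counts[sev] / total for sev in ("low", "medium", "high")}


# Hypothetical CVSS v2 scores of identified vulnerabilities.
scores = [2.1, 4.3, 5.0, 5.0, 6.8, 7.5, 9.3, 4.3, 5.8, 10.0]
shares = severity_shares(scores)
print(shares)
```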
5.3.2 Platform
Looking at platforms, Windows Server has the most vulnerabilities. This holds both in the
general case and for high-severity vulnerabilities. However, personal Windows systems seem more
secure. This might be because content is often hosted on servers that are exposed to the
network. There is also a slight tendency for the most common platforms to be those with fewer
vulnerabilities.
5.3.5 Remediation
The remediation times brought some more positive news. We see that organizations are working
hard to fix high-severity vulnerabilities quickly. These are fixed in less than a month, whereas
the others take closer to two months. This suggests that the focus in this field is in the right
place. However, a month is still a long time for a high-severity vulnerability to remain open.
The remediation of products shows that vendors are good at reaching out with easy-to-apply
updates and solutions to problems. Windows, with its built-in update policy, is shown here to be
one of the faster systems. Other technologies take longer for vulnerabilities to be fixed. For
example, an SNMP, Java or SSL vulnerability often takes over three months to remediate.
Over time, vulnerability remediation also shows positive news. Vulnerabilities seem to take less
time to remediate overall. This plot has some deviations and uncertainty, but at least during
the last year, vulnerability remediation times have gone down overall.
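A remediation KPI of this kind can be sketched as the mean number of days between the first detection of a vulnerability and the scan in which it was fixed, grouped by severity. The field names and dates below are hypothetical, and open findings are simply skipped, which is one of several possible choices.

```python
from datetime import date
from statistics import mean

# Hypothetical findings; "fixed" is None while the vulnerability is still open.
findings = [
    {"severity": "high", "first_seen": date(2016, 1, 4), "fixed": date(2016, 1, 25)},
    {"severity": "high", "first_seen": date(2016, 2, 1), "fixed": date(2016, 2, 28)},
    {"severity": "medium", "first_seen": date(2016, 1, 4), "fixed": date(2016, 3, 4)},
    {"severity": "medium", "first_seen": date(2016, 3, 1), "fixed": None},
]


def mean_remediation_days(rows):
    """Mean days from first detection to fix, per severity, ignoring open findings."""
    durations = {}
    for row in rows:
        if row["fixed"] is None:
            continue
        days = (row["fixed"] - row["first_seen"]).days
        durations.setdefault(row["severity"], []).append(days)
    return {severity: mean(days) for severity, days in durations.items()}


remediation = mean_remediation_days(findings)
print(remediation)
```

Skipping open findings makes the average look better than reality if many vulnerabilities are never fixed, so the share of remediated findings is worth reporting alongside it.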
the previous years were possible to compare. The average number of vulnerabilities found was
then compiled into the plot. The overall result was a steady increase in vulnerabilities seen,
especially during spring 2015.
5.3.8 CVSS
The CVSS was a new addition, and an easy one since CVSS scores were already part of the dataset.
The idea was that this would show slight trends over time in vulnerability severity in more
detail. This was successful, since one could observe a slight upwards trend over the last years.
The same trend could be seen in the severity category, where medium-severity vulnerabilities
were increasing at the expense of low-severity ones.
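The average CVSS over time can be illustrated with a short sketch that groups scores by year and averages them. The (year, score) pairs are invented sample values, and taking the year from the scan in which a vulnerability was found is an assumption.

```python
from collections import defaultdict


def average_cvss_by_year(observations):
    """Average CVSS score per year from (year, score) pairs."""
    sums = defaultdict(lambda: [0.0, 0])  # year -> [score sum, count]
    for year, score in observations:
        sums[year][0] += score
        sums[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(sums.items())}


# Hypothetical (year of scan, CVSS v2 score) pairs.
observations = [
    (2014, 5.0), (2014, 4.0),
    (2015, 5.0), (2015, 6.0),
    (2016, 7.0), (2016, 5.0),
]
averages = average_cvss_by_year(observations)
print(averages)
```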
5.3.9 CVEs
While not necessarily a KPI, interesting statistics on CVEs were presented. For some, this makes
for interesting reading, but it also keeps readers aware of the most common vulnerabilities. For
common vulnerabilities, there may also be a larger number of potential attackers. Therefore, it
is necessary to be aware of these to be able to respond to threats.
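A ranking of the most common CVEs can be sketched with a simple counter. The proportion here is computed as each CVE's share of all findings, which is an assumption about how the proportion is defined; the list of observations is invented, and CVE-0000-0001 is a placeholder identifier.

```python
from collections import Counter


def top_cves(cve_ids, n=10):
    """Return the n most common CVEs with their share of all findings in percent."""
    counts = Counter(cve_ids)
    total = sum(counts.values())
    return [(cve, 100.0 * hits / total) for cve, hits in counts.most_common(n)]


# Hypothetical findings, one CVE identifier per identified vulnerability.
observed = (
    ["CVE-2013-2566"] * 3
    + ["CVE-2014-3566"] * 2
    + ["CVE-0000-0001"]  # placeholder identifier
)
ranking = top_cves(observed, n=2)
for cve, share in ranking:
    print(f"{cve}  {share:.1f}%")
```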
Network vulnerabilities were also a less common subject in the studied reports, with the major
types being web applications or software. Comparisons between these may not always be accurate.
For example, remediating a vulnerability in an application often means that the source code must
be corrected. This takes longer than the most common way of fixing network vulnerabilities,
which is to simply download and apply a vendor patch. Some reports also looked at the general
situation for CVEs, that is, the number of possible vulnerabilities for different platforms or
categories. This study instead uses Outpost24's internal dataset, which represents the actual
number of vulnerabilities in live systems.
Below is a brief attempt at finding some common occurrences between the KPIs.
5.4.1 Severities
Looking for results comparable to ours, we see that Acunetix [27] presents similar findings. The
number of medium-severity vulnerabilities is more than double that of high. However, they do not
display low severity. Others either only look at the number of high or critical vulnerabilities,
while some use custom metrics such as "serious vulnerabilities". The use of CVSS version 2
versus CVSS version 3 is also problematic here, as this study only had access to CVSS v2 scores.
5.4.3 Remediation Percentage
These reports mostly looked at applications, and it is difficult to generalize accurately
between these as the remediation process is different. Some looked at remediation of specific
vulnerabilities.
Edgescan [36] dealt with both network vulnerabilities and applications. They showed an average
number of days for remediation. A similarity was that higher-severity vulnerabilities saw a
quicker response, which is also present in our results.
5.4.5 Vendor/Platform/Product
This was used in several ways in reports. Some looked at products of different types and others
looked at specific vendors only. One similarity was that Microsoft topped lists for
security-related events [35], which was also the result of our platform analysis. We also noted
that Microsoft was the most common platform, which may explain their occurrence in other reports
if these have not normalized their numbers in the same way. This area was very broad, and
reports looked specifically at the data available to them.
5.4.6 Regions
This KPI was popular within the reports. An example is the report from Microsoft [43], which
presents infection rates by country/region. Unfortunately, this KPI was excluded from our study.
It is difficult to present this data accurately and in an interesting manner. To keep the
statistics normalized, we must also somehow consider the amount of data for each region, so that
the region with the most customers does not skew the statistics.
Another difficulty is in the definition of regions in general. Too detailed region descriptions
could violate the privacy of customers. However, it would be interesting to examine in the
future how different regions or countries compare in terms of security.
5.4.7 Categories
This was the most common metric used in reports. However, it is also very difficult to compare
properly. The first issue is the difference in the datasets, and there is also an issue with
representing categories in a standard way. Most web application reports used the common Open Web
Application Security Project (OWASP) Top 10 hierarchy, for example WhiteHat [26], which does not
match well with network vulnerabilities. By using CWEs and the most comprehensive view
(Development Concepts), we allow for future comparisons with our data in a common way.
6 CONCLUSIONS
This work began with a study of KPIs. These were visualized in the format of a matrix where the
most common KPIs were displayed together with the reports that were part of the background
study. This simple visualization shows which KPIs were most common and which were later used in
our analysis. Together with this matrix, short descriptions of the KPIs were presented to help
readers better understand their meaning.
The second part was the completion of a report by following a newly defined process. This
process allows the company to release reports in the future. To make this process easier, a
program was written to automate the analysis of the raw data. Guidelines were set up on how to
redo a simplified version of the work performed in this project and generate a new report.
Finally, the data was examined using the dataset from 2016. This data was visualized in a
graphical format using plots. The analysis showed the data from this year, but also how it
changed over time. This provides an overview of the vulnerability situation for the data
available to Outpost24. The overall outlook is that the vulnerability landscape is becoming more
serious, and threats are increasing. The good news is that vulnerabilities are at least
remediated somewhat faster. The data shows the need for vulnerability management in
organizations to minimize risks. Simple solutions such as patching should be in place and
resources should be put into IT security.
The work has also contributed to an area of research that needs more material. The situation is
constantly evolving in today's connected society, and continuous studies into trends help show
this. By creating a report based on KPIs, the work could present new findings that could be of
interest to the authors, Outpost24 and readers interested in the subject. The authors also had
trouble finding scientific papers on the subject. This work thereby provides new knowledge about
vulnerability statistics research.
7 RECOMMENDATIONS AND FUTURE WORK
This work provides an overview of an underrepresented field in research. More studies are
necessary to allow for deep comparisons. By performing more studies and comparing with this
work, the results could be further tested and validated. Building on this study would also mean
identifying possible areas of improvement. Working with different datasets is also a natural
continuation of this work. If these are of the same type as the original dataset, comparisons
could be made that would lead to better generalization and a more accurate representation of the
overall situation. The process created in this study can also be repeated on the same dataset in
future years, as more data is added.
To obtain a more detailed report, individual studies into the results extracted from each KPI
would be recommended. The goal would be to examine the results of each KPI and attempt to find
causes for different behaviors. This would require more time after the actual data had been
extracted and some familiarity with the base data. However, it would also add legitimacy to the
results by properly detailing their explanations. A more detailed report could also consider
adding custom KPIs or metrics related to the dataset. For example, this could be achieved by
combining KPIs, such as splitting them by severity.
A study into web applications may be of interest in the future. This was a common topic in the
studied commercial reports. If no dataset for these exists, a comparative study between other
reports could be undertaken to try to find similarities between these. Examples include
comparisons between different regions or businesses and how vulnerabilities could affect these
in different ways. By then examining the reasons for these results, organizations could identify
how to better secure their digital presence.
8 REFERENCES
1. “CVE Details.” [Online] Available: http://www.cvedetails.com/ [Accessed: May 7, 2017].
4. A. Avizienis, J.-C. Laprie, B. Randell and C. Landwehr. "Basic concepts and taxonomy of
dependable and secure computing." IEEE Transactions on Dependable and Secure Computing 1.1.
2004.
5. S. Frei. (Feb 2013). Brief Analyst. Vulnerability threat trends. [Online]. Available:
https://www.researchgate.net/publication/258112783_Vulnerability_Threat_Trends_-
_A_decade_in_review_transition_on_the_way
6. G. McGraw, "Software security," in IEEE Security & Privacy, vol. 2, no. 2, pp. 80-83, Mar-Apr
2004.
8. M. Vieira, N. Antunes and H. Madeira. "Using Web Security Scanners to Detect Vulnerabilities
in Web Services." Dependable systems & Networks. 2009.
10. P. Mell, K. Scarfone and S. Romanosky. "A Complete Guide to the Common Vulnerability
Scoring System Version 2.0." Published by FIRST-Forum of Incident Response and Security
Teams. 2007.
11. S. Frei, M. May, U. Fiedler and B. Plattner. "Large-Scale Vulnerability Analysis." Proceedings of
the 2006 SIGCOMM workshop on Large-scale attack defense. ACM. 2006.
14. KKR. Choo. "The cyber threat landscape: Challenges and future research directions." Computers
& Security. 2011.
15. Y. Shi and Q. Wen. "A Value based Security Risk Assessment Method." Multimedia Information
Networking and Security (MINES). 2012.
16. D. Mellado, E. Fernández-Medina and M. Piattini. "A Comparison of Software Design Security
Metrics." Proceedings of the Fourth European Conference on Software Architecture: Companion
Volume. ACM. 2010.
17. W. Jansen. "Directions in Security Metrics Research." Diane Publishing. 2010.
18. D. Barone, L. Jiang, D. Amyot and J. Mylopoulos. "Reasoning with Key Performance
Indicators." IFIP Working Conference on The Practice of Enterprise Modeling. Springer Berlin
Heidelberg. 2011.
19. R. von Solms and J. van Niekerk. "From information security to cyber security." Computers &
Security. vol. 38. 2013.
20. M. S. Ahmed, E. Al-Shaer and L. Khan. "A novel quantitative approach for measuring network
security." INFOCOM 2008. The 27th Conference on Computer Communications. IEEE. 2008.
21. M. Frigault, L. Wang, A. Singhal, S. Jajodia. "Measuring Network Security Using Dynamic
Bayesian Network." Proceedings of the 4th ACM workshop on Quality of protection. ACM. 2008.
22. Y. Xie and A. Aiken. "Static Detection of Security Vulnerabilities in Scripting Languages."
USENIX Security. 2006.
23. Z. Durumeric, E. Wustrow and J. Halderman. "ZMap: Fast Internet-wide Scanning and Its
Security Applications." Usenix Security. 2013.
24. N. Jovanovic, C. Kruegel and E. Kirda. "Pixy: A Static Analysis Tool for Detecting Web
Application Vulnerabilities." Security and Privacy. 2006.
25. K. Ingols, R. Lippmann and K. Piwowarski. "Practical Attack Graph Generation for Network
Defense". Computer Security Applications Conference. 2006.
26. WhiteHat. (2016). "Web Application Security Statistics Report 2016." [Online] Available:
https://info.whitehatsec.com/rs/675-YBI-674/images/WH-2016-Stats-Report-FINAL.pdf
[Accessed: May 2, 2017].
27. Acunetix. (2016). "Web Application Vulnerability Report 2016." [Online] Available:
https://d3eaqdewfg2crq.cloudfront.net/resources/acunetix-web-application-vulnerability-report-
2016.pdf [Accessed: May 2, 2017].
28. HPE. (2016). "HPE Security Research Cyber Risk Report 2016." [Online] Available:
https://www.thehaguesecuritydelta.com/media/com_hsd/report/57/document/4aa6-3786enw.pdf
[Accessed: May 2, 2017].
29. Cisco. (2016). "Cisco 2016 Annual Security Report." [Online] Available:
http://www.cisco.com/c/m/en_us/offers/sc04/2016-annual-security-
report/index.html?KeyCode=001031927# [Accessed: May 2, 2017].
30. Trustwave. (2016). "Trustwave Global Security Report 2016." [Online] Available:
https://www2.trustwave.com/GSR2016.html?utm_source=library&utm_medium=web&utm_cam
paign=GSR2016 [Accessed: May 2, 2017].
31. Verizon. (2016). "2016 Data Breach Investigations Report." [Online] Available:
http://www.verizonenterprise.com/resources/reports/rp_DBIR_2016_Report_en_xg.pdf
[Accessed: May 2, 2017].
32. Symantec. (2016). "Internet Security Threat Report 2016." [Online] Available:
https://www.symantec.com/content/dam/symantec/docs/reports/istr-21-2016-en.pdf [Accessed:
May 2, 2017].
33. S. Goel and H. A. Shawky. "Estimating the market impact of security breach announcements on
firm values." Information & Management. 2009.
34. S. Morgan. (2016). "Top 2016 Cybersecurity Reports Out From AT&T, Cisco, Dell, Google,
IBM, McAfee, Symantec and Verizon." [Online] Available:
https://www.forbes.com/sites/stevemorgan/2016/05/09/top-2016-cybersecurity-reports-out-from-
att-cisco-dell-google-ibm-mcafee-symantec-and-verizon/#6990fa171caf [Accessed: May 2,
2017].
38. Eset. (2016). "Trends 2016 (IN) Security Everywhere." [Online] Available:
https://www.welivesecurity.com/wp-content/uploads/2016/01/eset-trends-2016-insecurity-
everywhere.pdf [Accessed: May 7, 2017].
45. Radware. (2016). "Global application & network Security report." [Online] Available:
https://www.radware.com/documents/infographics/2016-2017-global-application-network-
security-infographic/ [Accessed: May 7, 2017].
47. Telstra. (2016). "Cyber Security Report 2016." Trends in Cybersecurity. [Online] Available:
https://www.telstra.com.au/content/dam/tcom/business-enterprise/campaigns/pdf/cyber-security-
whitepaper.pdf [Accessed: May 7, 2017].
Blekinge Institute of Technology, Campus Gräsvik, 371 79 Karlskrona, Sweden