Explainable Artificial Intelligence and Cybersecurity: A Systematic Literature Review
• Lost intellectual property (including trade secrets)
• Disruption or damages to critical infrastructure
• Cost of outside consultants and experts
• Lost revenues

Cyber incidents are touted as the biggest risk factor for business in 2022, according to research conducted by insurer Allianz, even ahead of business interruption, natural catastrophes, and pandemic outbreak [3].

Cybersecurity spans incredibly diverse specialties. The (ISC)² (International Information System Security Certification Consortium), maintainer of the CISSP (Certified Information Systems Security Professional) qualification, projects its exam in 8 domains:
• Security and Risk Management
• Asset Security
• Security Architecture and Engineering
• Communication and Network Security
• Identity and Access Management
• Security Assessment and Testing
• Security Operations
• Software Development Security

• Web Domain and Reputation Assessment

Unfortunately, there are AI techniques whose operation is not transparent to the user and which also do not provide explanations of how they arrived at the generated result, as is usually the case with neural networks, for example. These are called “black box” AI techniques. A better understanding by technology operators is desirable, as it allows greater [10]:
• Trust in its decisions
• Social acceptance
• Ease of debugging and auditing
• Fairness (by the ease of bias detection)
• Assessment of the relevance of learned features
With this problem in mind, the concept of XAI (eXplainable Artificial Intelligence) was derived, whose goal is to make the operation of an AI algorithm more understandable to its users and developers. XAI can be defined as the set of AI methods capable of conveying to a suitably specialized observer how they arrived at a classification, regression, or prediction. The discussion about what constitutes “understanding” is heated. There is no settled, widely accepted definition, but attempts at formalization have been under development since at least [11]. For this discussion, the reading of [12] may be relevant.

Some features are particularly desirable in an XAI application [13]:
• Understandability
• Fidelity: a reasonable representation of what the AI system actually does.
• Sufficiency: detailed enough to justify the AI decision.
• Low Construction Overhead: the explanation should not dominate the cost of designing the AI.
• Efficiency: the explanation should not significantly slow down the AI.

With regard specifically to the application of XAI to cybersecurity, [14] and [15] address this matter in a high-level way, proposing a so-called desiderata for the area and a general architecture that can serve as a roadmap for guiding research efforts towards the development of XAI-based cybersecurity systems.

One way XAI algorithms can be classified is by whether interpretability is achieved by restricting the complexity of the machine learning model (intrinsic) or by applying methods that analyze the model after training (post hoc). Furthermore, depending on the scope of interpretability, they can be classified as global (explaining the entire model behavior) or local (explaining an individual prediction) [10].
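To make this taxonomy concrete, the following minimal sketch (not taken from any of the reviewed papers; the dataset is synthetic and the network-flow feature names are invented for illustration) contrasts an intrinsically interpretable, global model with post hoc explanations, one global and one local, of a black-box classifier:

    # Illustrative sketch only: synthetic data and hypothetical feature names.
    # Requires numpy and scikit-learn.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.tree import DecisionTreeClassifier, export_text

    feature_names = ["duration", "bytes_sent", "bytes_received", "failed_logins"]
    X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                               n_redundant=0, random_state=0)

    # Intrinsic + global: a shallow decision tree is readable as a whole.
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    print(export_text(tree, feature_names=feature_names))

    # Post hoc + global: permutation importance probes a "black box" after training.
    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    result = permutation_importance(forest, X, y, n_repeats=10, random_state=0)
    print(dict(zip(feature_names, np.round(result.importances_mean, 3))))

    # Post hoc + local: attribute a single prediction by measuring how the predicted
    # probability of class 1 changes when each feature is replaced by its mean.
    x0 = X[:1]
    base = forest.predict_proba(x0)[0, 1]
    for j, name in enumerate(feature_names):
        perturbed = x0.copy()
        perturbed[0, j] = X[:, j].mean()
        print(name, round(base - forest.predict_proba(perturbed)[0, 1], 3))

Dedicated libraries such as SHAP and LIME, which appear repeatedly in the reviewed papers, automate the local attribution step with stronger theoretical grounding than the naive perturbation shown here.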
Through a Systematic Literature Review (SLR), this work seeks to investigate the current research scenario on XAI applied to cybersecurity.

The SLR follows 3 well-defined steps: search (query), analysis (quantitative and qualitative insights), and conclusion (response to the Main Research Question). In the following sections, each of the SLR phases will be presented, with their development and results.

II. SLR PHASE 1: SEARCH

At this stage of the SLR, we defined the Main Research Question (MRQ), a set of Secondary Questions (SQ), the repositories to be searched, the language of the articles to be evaluated, the keywords and search query, and the inclusion and exclusion criteria for returned articles.

In order to investigate the current research scenario on XAI techniques applied to cybersecurity, the Main Research Question we aim to answer is:

What are the XAI techniques used to promote more interpretable automated cyber risk classification?

In addition, some secondary questions were elaborated, seeking to give more detail on the research scenario in the area:
1) Which countries does most of the research on the subject come from?
2) What is the frequency of published studies on the subject?
3) How are studies in the area divided by type of publication?
4) Which authors and institutions publish the most on the topic?
5) What domains of cybersecurity have already benefited from XAI research?
6) Why is security analysts’ ability to interpret AI cyber risk classification important?
7) How are techniques evaluated?
8) What are the limitations of current techniques?

The repositories of scientific articles in which to conduct the search were defined based on the prevalence of use by researchers of information technologies, in addition to allowing access via the Web and search queries:
1) Scopus (http://www.scopus.com/home.url)
2) ACM Digital Library (http://portal.acm.org/)
3) IEEE Xplore Digital Library (http://ieeexplore.ieee.org/)

Only articles written in English were considered in the scope of this work. In addition, books and panels were disregarded.

A set of keywords was generated, among them the more general: “explainable artificial intelligence” and “cybersecurity”, and its variation “cyber security” (the forms “cyber-security” and “interpretable artificial intelligence” did not add new results). Aiming to increase the number of results returned, keywords were added referring to specific cybersecurity topics, so that articles that do not mention “cybersecurity” but otherwise belong to the area can be included. The added keywords are:
1) “detection and response”
2) “intrusion detection”
3) “intrusion prevention”
4) “cyber risk”
5) “malware”

Thus, the formulated search string is:

(“explainable artificial intelligence”) AND (“cybersecurity” OR “cyber security” OR “detection and response” OR “intrusion detection” OR “intrusion prevention” OR “cyber risk” OR “malware”)
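As an illustrative aid only (the queries themselves were simply submitted to each repository's search engine), the same boolean string can be assembled programmatically from the keyword lists above:

    # Sketch: assemble the boolean search string from the defined keywords.
    primary_term = "explainable artificial intelligence"
    domain_terms = [
        "cybersecurity",
        "cyber security",
        "detection and response",
        "intrusion detection",
        "intrusion prevention",
        "cyber risk",
        "malware",
    ]

    search_string = '("{}") AND ({})'.format(
        primary_term,
        " OR ".join('"{}"'.format(term) for term in domain_terms),
    )
    print(search_string)

Running this prints the string shown above, which can then be pasted into each repository's advanced-search form.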
As inclusion criteria for an article returned by the search to be considered relevant for reading and analysis, the following were established:
1) Is it a primary work, as opposed to other literature reviews?
2) Does the XAI technique discussed have a cybersecurity domain as its main application?
Since the returned articles were in a manageable quantity, there was no need to establish exclusion criteria.

Once all these SLR parameters were defined, the search itself was carried out:
1) The queries were performed in the established repositories
2) Redundant articles were filtered (a simple deduplication sketch is shown after Table III)
3) Titles and abstracts were read, considering the inclusion/exclusion criteria
4) The remaining articles were read in their entirety

Table I: Papers Retrieved from Queries

  Repository                   Number of Papers
  ACM                          15
  IEEE                         13
  Scopus                       24
  Total                        52
  Unique and Valid References  42
  After Inclusion Criteria     21

Table II: Papers per Country

  Country      Number of Papers
  USA          13
  China        3
  India        3
  Italy        3
  Germany      2
  South Korea  2
  Austria      1
  Canada       1
  Ireland      1
  Israel       1
  Japan        1
  Mexico       1
  Poland       1
  Qatar        1
  UAE          1
  UK           1
  Total        36

Table III: Authors per Country
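Step 2 above, the filtering of articles returned by more than one repository, can be performed in many ways; the sketch below (illustrative only, with invented record data, matching on DOI when available and otherwise on a normalized title) shows one simple approach:

    # Illustrative sketch: filter redundant records exported from several repositories.
    import re

    def normalize_title(title):
        # Lowercase and strip punctuation so near-identical titles compare equal.
        return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

    def deduplicate(records):
        # Keep the first occurrence of each paper; records are dicts with an
        # optional "doi" key and a "title" key.
        seen, unique = set(), []
        for rec in records:
            key = rec.get("doi") or normalize_title(rec["title"])
            if key not in seen:
                seen.add(key)
                unique.append(rec)
        return unique

    retrieved = [
        {"title": "Explaining Intrusion Detection Models", "doi": "10.0000/example.1"},
        {"title": "Explaining intrusion detection models.", "doi": "10.0000/example.1"},
        {"title": "XAI for Malware Classification"},
    ]
    print(len(deduplicate(retrieved)))  # prints 2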
VI. APPENDIX
The following appendix contains a more detailed summary
of all the papers reviewed in this work.
Table VIII: Summary of Papers
[24]
  Technique(s): SHAP, LIME (Local Interpretable Model-Agnostic Explanations), and an auto-encoding-based scheme for LSTM (Long Short-Term Memory) models
  Domain: Malware Classification
  Motivation: “The explanation will justify and support disruptive administrative decisions” and to answer the questions “Why did the ML classify a particular pod as a miner? How does the syscalls sequence change from one pod to another? Which feature has the greatest impact on miner prediction? Is there any way to visualize the ML outcome apart from plotting the evaluation metrics?”
  Evaluation: “The performance of an autoencoder model is evaluated based on the model’s ability to recreate the input sequence. Validation of the autoencoder model also validates the upstream half of the classifier model which, in turn, further strengthens the trust in the classifier’s outcome.”
  Limitations: “Convergence is a major issue with autoencoder design, especially when the dataset size is large, and the centroids of the various classes have significant variance. Also, when convergence is achieved, it is often the case that it is at a local minimum of the loss function. Such difficulties impact the quality of the explainability method.”

[25]
  Technique(s): Deep Embedded Neural Network Expert System
  Domain: Malware Classification, Phishing Detection
  Motivation: “Security experts not only do need to detect the incoming threat but also need to know the incorporating features that cause that particular security incident” and “Adding an explanation feature to a neural network would enhance its trustworthiness and reliability.”
  Evaluation: (not reported)
  Limitations: (not reported)

[26]
  Technique(s): Gradient-based Explanations
  Domain: Malware Classification
  Motivation: “We investigate whether gradient-based attribution methods used to explain classifiers’ decisions provide useful information about the robustness of Android malware detectors against sparse attacks.”
  Evaluation: “We propose and empirically validate a few synthetic metrics that allow correlating the evenness of gradient-based explanations with the classifier robustness to adversarial attacks.”
  Limitations: (not reported)

[27]
  Technique(s): Domain Knowledge Infusion
  Domain: Intrusion Detection
  Motivation: “The lack of explainability and interpretability of successful AI models is a key stumbling block when trust in a model’s prediction is critical. This leads to human intervention, which in turn results in a delayed response or decision”
  Evaluation: The authors conduct an Explainability Test whose purpose “is to discover the comparative advantages or disadvantages of incorporating domain knowledge in the experiment”
  Limitations: The authors stress that “there are some open challenges surrounding explainability and interpretability such as an agreement of what an explanation is and to whom, a formalism for the explanation, and quantifying the human comprehensibility of the explanation”
[35]
  Technique(s): LIME and Saliency Maps
  Domain: Intrusion Detection
  Motivation: “Researchers in the field of network security are the target users of our visual analytics system. The design goal of our visual analytics system is to aid our target users to better interpret the deep learning model.”
  Evaluation: (not reported)
  Limitations: No case study on real datasets; lack of scalability for larger DL models; limited to CNNs.

[36]
  Technique(s): Random Forest
  Domain: DGA Classification, Intrusion Detection
  Motivation: “The proposed state-of-the-art classifiers are based on deep learning models. The black box nature of these makes it difficult to evaluate their reasoning. The resulting lack of confidence makes the utilization of such models impracticable”
  Evaluation: (not reported)
  Limitations: (not reported)

[37]
  Technique(s): SHAP
  Domain: Intrusion Detection
  Motivation: “Help us understand how deep learning models learn and why they make such decisions for each input” and also “We propose a method to improve detection efficiency by using XAI to reduce the input data”
  Evaluation: (not reported)
  Limitations: (not reported)

[38]
  Technique(s): Domain Knowledge Infusion
  Domain: Intrusion Detection
  Motivation: “The lack of explainability leads to a lack of trust in the model and prediction, which can involve ethical and legal issues in critical domains due to the potential implications on human interests, rights, and lives”
  Evaluation: The authors extend the work done in [27], applying to it the Explainability Quantification Method previously developed by them in [53].
  Limitations: (not reported)

[39]
  Technique(s): DALEX (moDel Agnostic Language for Exploration and eXplanation)
  Domain: Intrusion Detection
  Motivation: “Apply explainable AI to better interpret how the model reacts to the shifting distribution.”
  Evaluation: (not reported)
  Limitations: (not reported)
[40]
  Technique(s): Decision Tree, Visual Analytics
  Domain: DGA Classification, Intrusion Detection
  Motivation: “Deep learning models have found wide adoption for many problems. However, their blackbox nature makes it hard to trust their decisions and to evaluate their line of reasoning. In the field of cybersecurity, this lack of trust and understanding poses a significant challenge for the utilization of deep learning models”
  Evaluation: (not reported)
  Limitations: (not reported)

[41]
  Technique(s): LEMNA (Local Explanation Method using Nonlinear Approximation)
  Domain: Malware Classification, Binary Reverse Engineering
  Motivation: “Security practitioners are concerned about the lack of transparency of the deep learning models and thus hesitated to widely adopt deep learning classifiers in security and safety-critical areas”
  Evaluation: “The fidelity metrics are computed either by directly comparing the approximated detection boundary with the real one, or running end-to-end feature tests”
  Limitations: (not reported)

[42]
  Technique(s): Adversarial
  Domain: Malware Classification
  Motivation: “Researchers should use explanation techniques to understand the behavior of the classifiers and check if the learned features are fragile features that can be easily evaded or if they conflict with expert knowledge”
  Evaluation: (not reported)
  Limitations: (not reported)

[43]
  Technique(s): ANCHOR, LIME, SHAP and Counterfactual Explanations
  Domain: DGA Classification, Intrusion Detection
  Motivation: XAI and Open-Source Intelligence can together address “trust problems”, serving as “an antidote for skepticism to the shared models and preventing automation bias.”
  Evaluation: (not reported)
  Limitations: (not reported)
[44]
  Technique(s): Decision Trees
  Domain: Anomaly Detection
  Motivation: They cite other authors stating that “explanations ensure the correct behavior of the algorithm” and “machine learning systems would be more widely accepted once they are capable of providing satisfactory explanations for their decisions.”
  Evaluation: (not reported)
  Limitations: “First, our architecture should be used in datasets with less than a thousand attributes because it builds a tree for each attribute. Building thousands of trees is time-consuming, although this limitation may be overcome with access to better computing resources. Second, our proposal may fail to build decision trees for attributes with tens of different values in the definition domain.”

[45]
  Technique(s): Heatmaps
  Domain: Malware Classification
  Motivation: “The XAI aims to enable human users to develop understanding and trusts to the model prediction.” and also “The effectiveness of these [autonomous] systems is limited by the current inability of machines to explain their decisions and actions to human users. The most important step towards reliable models is the possibility to understand their prediction i.e., the so-called interpretability.”
  Evaluation: (not reported)
  Limitations: “Despite their usefulness, the cumulative heatmaps, at the time of writing, do not play a role that can be automatized without knowledge on the dataset and the malware code. They help to interpret and understand the outcomes, but they do not provide fixed information that could be used to any user to evaluate models without any prior-knowledge on the architecture or the problem itself.”
[46]
  Technique(s): TRUST (Transparency Relying Upon Statistical Theory)
  Domain: Intrusion Detection
  Motivation: “Despite the popularity of AI, it is limited by its current inability to build trust. Researchers and industrial leaders have a hard time explaining the decisions that sophisticated AI algorithms come up with because they (as AI users) cannot fully understand why and how these “black boxes” make their decisions.”
  Evaluation: (not reported)
  Limitations: “Due to using information gain in picking the representatives, TRUST might overfit to the training set. This would lead to poor performance on unseen data. On the other hand, if the Gaussian assumption cannot be made or the probability distribution of data changes, the output of TRUST would not be reliable. Also, the assumption of samples being drawn independently is very important.”

[47]
  Technique(s): LIME
  Domain: Abuse of Privacy-related Permissions on Mobile Apps
  Motivation: “To assess the quality of our network and to avoid incomprehensible black box predictions, we employ the model explaining algorithm LIME”
  Evaluation: (not reported)
  Limitations: (not reported)

[48]
  Technique(s): Deep SHAP
  Domain: Network Traffic Classifier, Intrusion Detection
  Motivation: “The black-box nature of DL techniques hides the reason behind specific classification outcomes. This impacts the understanding of classification errors and the evaluation of the resilience against adversarial manipulation of traffic to impair identification. Moreover, by understanding the behavior of the learned model, performance enhancements can be pursued with much more focused and efficient research, compared with a less-informed exploration of the (typically huge) hyper-parameters space.”
  Evaluation: (not reported)
  Limitations: (not reported)