Proposal

Implementing a honeypot for IoT Smart Homes to cope with zero-day
attacks using machine learning
CHAPTER 1
Problem Statement (200 words max):
As the number of Internet of Things (IoT) devices in smart homes increases, the risk of zero-
day attacks also increases. Zero-day attacks exploit unknown vulnerabilities in software and
hardware, making them difficult to detect and defend against. IoT devices are often
vulnerable to such attacks due to their limited resources and lack of security updates.
To address this problem, implementing a honeypot for IoT smart homes can be an effective
solution. A honeypot is a security mechanism that creates a decoy system or application to
attract attackers and detect their activities. By implementing a honeypot, it is possible to
identify zero-day attacks and gather information about the attackers' methods and tactics.
Machine learning techniques can be used to develop a more effective honeypot for IoT smart
homes. Machine learning algorithms can analyse network traffic and identify patterns that
indicate an attack. By training the machine learning models on data from previous attacks, the
honeypot can become more accurate and effective in detecting zero-day attacks.
The scope of this research is:

 • Design and architecture.
 • Data collection and analysis.
 • Machine learning models.
 • Attack simulation.
 • Integration with existing security systems.
Research Questions:
 • How effective are honeypots at detecting and mitigating cyber-attacks on IoT devices,
and how does machine learning improve their performance?
 • What are the design considerations for implementing a honeypot for cyber security?
 • What are the limitations and challenges in implementing a honeypot for IoT devices
using machine learning, and how can they be addressed to improve its performance?
 • How does the use of honeypots compare to other cyber security approaches, and what
are their advantages and disadvantages in different contexts?
Research Objectives:
 • Detect and mitigate zero-day attacks: The honeypot will act as a trap for attackers, and
by analysing the data collected by the honeypot, organizations can identify and
mitigate zero-day attacks.
 • Improve machine learning algorithms: The honeypot will use machine learning
algorithms to detect and classify the behaviour of the attackers. The data collected by
the honeypot can be used to improve the machine learning algorithms, making them
more accurate and effective.
 • Enhance threat intelligence: By analysing the data collected by the honeypot,
organizations can gain valuable insights into the tactics, techniques, and procedures
used by attackers. This information can be used to enhance threat intelligence,
enabling organizations to better protect their IoT smart homes.
 • Provide early warning of attacks: The honeypot can provide early warning of attacks,
allowing organizations to take proactive measures to prevent the attacks from causing
damage.
 • Identify vulnerable devices: By analysing the data collected by the honeypot,
organizations can identify vulnerable devices and take measures to patch or replace
them.
CHAPTER 2
Literature Review
This chapter reviews papers relevant to honeypots and machine learning as used in the
IoT.
The Internet of Things (IoT) is now everywhere, and perhaps nowhere is it more pervasive than
in the modern smart home. Cyber-attacks on smart homes are on the rise as IoT devices
proliferate, and a honeypot is one way to detect and monitor them. This chapter critically
examines the existing literature on using honeypots together with machine learning in IoT
smart homes to mitigate the effects of zero-day attacks.
A honeypot is a security mechanism used in computer networks to detect and deflect intrusion
attempts. It is a decoy network, service, or piece of software that is deliberately exposed so
that attackers probe it rather than real assets. By monitoring and analysing the attacks it
receives, a honeypot reveals patterns in attacker behaviour. In recent years honeypots have
become increasingly popular as a low-cost way of detecting and monitoring cyber-attacks on
IoT smart homes.
IoT devices make smart homes susceptible to zero-day attacks. A zero-day attack exploits a
security flaw that was not previously known to the vendor or to defenders, so no patch or
signature exists when it is first used. Such attacks commonly go undiscovered until after they
have caused significant harm. Traditional security methodologies struggle to identify these
threats, whereas machine learning has been reported to be an effective way of detecting
them.
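The idea that a learned model of normal behaviour can flag an attack that has no known signature can be illustrated with a small sketch. The example below is not taken from any of the reviewed papers. The two features (packets per second and distinct destination ports), the baseline values, and the threshold of three standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

# Hypothetical baseline of benign smart-home flows:
# (packets per second, distinct destination ports contacted)
BASELINE = [(12, 2), (15, 3), (11, 2), (14, 2), (13, 3), (12, 2)]


def fit_baseline(flows):
    """Return per-feature (mean, standard deviation) learned from benign flows."""
    columns = list(zip(*flows))
    return [(mean(col), stdev(col)) for col in columns]


def anomaly_score(flow, profile):
    """Largest absolute z-score across features; higher means more unusual."""
    return max(abs(x - mu) / sigma for x, (mu, sigma) in zip(flow, profile))


def is_suspicious(flow, profile, threshold=3.0):
    """Flag a flow that deviates strongly from the benign profile."""
    return anomaly_score(flow, profile) > threshold


profile = fit_baseline(BASELINE)
print(is_suspicious((13, 2), profile))    # typical flow
print(is_suspicious((400, 60), profile))  # scan-like burst with no known signature
```

The detector never sees an example of the attack during fitting, which is the property that makes anomaly-based approaches attractive for zero-day detection. A practical system would need far richer features and a way to control false alarms.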
Scope of Research
Review Questions
 • What challenges exist in ML-based honeypots?
 • How well do IoT honeypots detect and mitigate cyber-attacks, and how does machine
learning improve their performance?
 • What are the design considerations for cybersecurity honeypots?
 • How might machine learning honeypots for IoT devices be improved? What are their
limitations and challenges?
Research Selection Criteria
 • Journal articles and conference papers.
 • Research published between 2019 and 2022.
 • Research must provide answers to the research questions.
 • Research must state its title and year.
 • Literature must target honeypots, honeypots for IoT smart homes, and machine
learning.
Research Exclusion Criteria

 • Source: IEEE, Google Scholar, Hindawi, MDPI, and ScienceDirect.
 • Search equations: Honeypot, Machine Learning.
Targeted Area
 • Honeypot for IoT Smart Homes.
 • Honeypot and Machine Learning.
Intelligent-Interaction Honeypot
To address IoT security issues, this article proposes deploying deception strategies and
honeypots. Its goals are to find flaws in IoT devices and to make those devices safer in a
cost-effective way. The approach uses machine learning to build honeypots that deceive and
engage attackers, so that attacker sessions are prolonged and more IoT network threats are
captured. The paper's significance lies in showcasing the potential of machine-learning-powered
IoT honeypots and in drawing attention to the need for efficient, inexpensive ways to discover
security flaws in IoT devices. Future work on the honeypot includes improving the machine
learning methods, testing in real-world IoT setups, monitoring and updating the honeypot to
keep pace with changing attacker methods, assessing its impact on IoT security, and carrying
out a cost-benefit analysis of its economic feasibility in IoT systems (Mfogo, 2023).
The next paper makes two proposals. First, it presents a multi-cascaded CNN classifier for
detecting attacks on IIoT networks. Second, it recommends dynamic honeypot encryption for IoT
cloud data transmission and storage. The system targets secure, energy-efficient IIoT data
transfer: multi-scale grasshopper optimisation refines the network model, and once the Robust
Multi-cascaded CNN (RMC-CNN) classifier detects an attack, dynamic honeypot encryption
protects the data saved in the cloud. Accuracy, precision, recall, and F1-score are compared
on power, loop sensor, and land sensor datasets. The proposed method's throughput, latency,
and detection rate are compared against existing approaches, as are its encryption,
decryption, and running times. The paper's contributions are low-power IIoT data transmission,
RMC-CNN network breach detection, dynamic honeypot encryption for data security, encrypted IoT
cloud storage, and distributed ledger storage of encryption keys. Simulations show that the
suggested technique transfers secret data more quickly, and the paper also emphasises the new
system's cost-function performance relative to prior systems. The identified gap is building
a strong detection framework and applying it to real-time data. The authors recommend
investigating real-time data issues in a virtual data environment and analysing the behaviour
of IIoT network microservices (Sankaran, 2023).
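The four evaluation metrics named above have standard definitions in terms of confusion-matrix counts. The sketch below computes them from true positives, false positives, true negatives, and false negatives. The counts in the example call are made up for illustration and are not results from the reviewed paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts for illustration only: 90 attacks caught, 10 missed,
# 5 false alarms, 895 benign flows correctly passed.
print(classification_metrics(tp=90, fp=5, tn=895, fn=10))
```

Because attack traffic is usually a small fraction of all traffic, accuracy alone can look high even when many attacks are missed, which is why precision, recall, and F1 are reported alongside it.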
IoT Smart Home Networks using Machine Learning Methods

This study highlights the significance of protecting users' data access and privacy. Its aims
are to investigate the role of machine learning in creating efficient intrusion detection
systems (IDS), to identify the risks that adversarial machine learning (AML) attacks pose to
ML-based IDS, and to test whether adversarial training using DoS attack tactics increases the
robustness of supervised models. The authors analyse an IoT scenario using AML and adversarial
training and report high accuracy with XGBoost, decision tree, and AdaBoost classifiers. The
research contributes by exploring adversarial attacks on supervised classifiers, improving
adversarial attack detection with machine and deep learning, and demonstrating both the
sensitivity of machine learning detectors and the possible impact of AML on IoT networks. In
future work, the researchers hope to better understand the impact of adversarial attacks on
IoT security and privacy, develop and test advanced models, use IoT data for training and
testing IDS, and put preventative measures in place (Iqbal, 2022).
This study proposes a security solution for safeguarding IoT devices: an intrusion detection
system built with an ensemble machine learning approach. The efficacy of the solution is
assessed through experiments on several industry-standard datasets, which the authors use to
check its reliability and accuracy. The main contribution is the design and implementation of
this ensemble-based intrusion detection system. Future work includes extending the experiments
to a wider range of IoT datasets, incorporating current cyber threat models, addressing the
difficulty of securing low-energy IoT devices, investigating alternative security solutions
for confidentiality and reliability, and testing the solution on operational IoT devices to
correct any anomalies (Das, 2022).
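An ensemble approach combines the outputs of several base models into one decision. The sketch below shows one common combination rule, majority voting. It is an illustration of the general idea only, since the reviewed paper's specific ensemble method is not described here. The three base detectors, their thresholds, and the record fields are hypothetical.

```python
from collections import Counter


# Three deliberately simple, hypothetical base detectors. Each looks at one
# feature of a connection record and votes "attack" or "benign".
def rate_detector(record):
    return "attack" if record["requests_per_min"] > 100 else "benign"


def login_detector(record):
    return "attack" if record["failed_logins"] >= 5 else "benign"


def payload_detector(record):
    return "attack" if record["payload_bytes"] > 4096 else "benign"


DETECTORS = [rate_detector, login_detector, payload_detector]


def ensemble_predict(record, detectors=DETECTORS):
    """Return the label chosen by the majority of base detectors."""
    votes = Counter(detector(record) for detector in detectors)
    return votes.most_common(1)[0][0]


record = {"requests_per_min": 250, "failed_logins": 7, "payload_bytes": 512}
print(ensemble_predict(record))  # two of three detectors vote "attack"
```

Using an odd number of detectors with two labels avoids ties. In practice the base detectors would be trained classifiers rather than fixed thresholds.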
This study responds to rising concerns over the safety of Internet-connected technology,
including embedded systems, cyber-physical devices, and the Internet of Things (IoT). Its
goals are to formulate a workable plan for defending IoT infrastructure, to analyse the
efficacy of machine learning (ML) algorithms in defending the IoT and its ecosystem, and to
investigate relevant datasets. The method involves analysing recent datasets to detect
security vulnerabilities and propose solutions, and evaluating how effectively ML techniques
handle IoT security. Contributions include a discussion of IoT data and crypto ransomware, the
highlighting of clustering challenges for attack detection, the use of a feature-rich dataset
(UKM-IDS20), and a demonstration of the effectiveness of three machine learning algorithms in
recognising and classifying IoT attacks. Future work includes integrating the clusters found,
extending the dataset to cover current threats, supporting a wider range of IoT and
operational technology protocols, and using STIX for structured threat data sharing
(Ariffin, 2022).
Honeypot Architecture for Detection of Zero-Day Attacks in IoT

This study is concerned with locating security flaws in interconnected devices, researching
criminal activity that targets IoT devices, and easing concerns over the safety of these
devices. Its technique uses honeypots both to find security holes in IoT devices and to track
attackers in order to learn about their tactics. The research introduces two honeypot systems:
IoTCandyJar, which has enhanced capabilities for data analysis, and IoTZeroJar, a proposed
system for detecting and investigating zero-day attacks. The authors report improved speed for
both systems. Future tasks include implementing and testing the filtering components of
IoTZeroJar, integrating real-time capture modules with machine learning to identify suspicious
requests, and ensuring that detection and analysis methods keep up with the evolving tactics
of attacks on IoT networks (Ellouh, 2022).
This study addresses the detection of potential threats through honeypots. It proposes
watermarked learning models built with machine learning techniques. The approach generates a
private key that is embedded in the watermark and applies a threat model within the
HoneyModel framework. The primary contributions are the creation of machine learning
honeypots, called HoneyModel, that can identify the adversarial use of machine learning
models; the synthesis of watermark keys embedded inside the models; and the use of neural
networks for training the models. The authors identify further investigation of advanced
neural network methods as an area for future work (Abdou, 2021).
This paper first presents a conceptual architecture for the protection of networks. Second, it
proposes a machine learning (ML) model and honeypot system to improve security across
different types of organisation. The approach sets up three low-interaction honeypots within a
private network and applies Support Vector Machine (SVM) algorithms to build an enhanced
artificial intelligence (AI) system. The aim is a security solution that enterprises can use
to secure their data and defend themselves against cyber-attacks. In future work, the authors
intend to investigate high-interaction honeypots in conjunction with reinforcement learning
(RL) to further improve security measures
(Tsochev, 2021).
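A low-interaction honeypot only imitates the outer surface of a service and records what clients send. The sketch below is a minimal illustration using Python's standard `socket` module. It is not the reviewed authors' implementation, and the fake banner, log fields, and `demo` helper are assumptions. The demo binds to localhost on an ephemeral port and plays the part of the attacker itself, so it does not expose anything to a real network.

```python
import socket
import threading
from datetime import datetime, timezone


def run_honeypot(server_sock, log, max_connections=1, banner=b"login: "):
    """Accept connections, send a fake banner, record what the client sends."""
    for _ in range(max_connections):
        conn, (ip, port) = server_sock.accept()
        with conn:
            conn.settimeout(2.0)
            conn.sendall(banner)
            try:
                data = conn.recv(1024)
            except socket.timeout:
                data = b""
            log.append({
                "time": datetime.now(timezone.utc).isoformat(),
                "src_ip": ip,
                "src_port": port,
                "payload": data.decode("utf-8", errors="replace"),
            })


def demo():
    """Start the decoy on an ephemeral localhost port and probe it once."""
    log = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0: let the OS choose a free port
    server.listen()
    worker = threading.Thread(target=run_honeypot, args=(server, log))
    worker.start()
    with socket.create_connection(server.getsockname()) as client:
        client.recv(1024)            # read the fake banner
        client.sendall(b"admin\n")   # simulated attacker input
    worker.join()
    server.close()
    return log


if __name__ == "__main__":
    print(demo())
```

Records of this kind (source address, timing, payload) are the raw material from which features for a classifier such as an SVM would be derived. That step is not shown here.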
This paper proposes AntiConcealer, an edge-assisted IoT framework that uses edge artificial
intelligence to determine whether an attacker is concealing their activity in the Internet of
Things (IoT). Honeypots are integrated with servers in the architecture so that the
effectiveness and accuracy of recognising attacker behaviour patterns can be assessed. The
approach uses a Multivariate Hawkes Process to build an adversarial behaviour model and then
applies it, using BPGM, to discover concealed behaviours. The classified hidden behaviours are
then placed in a non-negative weighted impact matrix, and a Decision Tree is used to evaluate
the matrix. The main contribution is the AntiConcealer framework itself, which identifies and
inhibits harmful activity in the IoT by using an AI technique to detect disguised behaviours.
The study points to reinforcement learning as future work that could further extend the
framework's capabilities (Zhang, 2021).
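A multivariate Hawkes process models how past events of one type raise the rate of future events of the same or other types. The sketch below computes the conditional intensity for the textbook form with an exponential decay kernel. Whether the reviewed paper uses this particular kernel is an assumption, and the behaviour types and parameter values are hypothetical.

```python
import math


def hawkes_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of each event type at time t.

    events: list of (timestamp, type_index) pairs
    mu[i]: baseline rate of type i
    alpha[i][j]: how strongly a past event of type j excites type i
    beta: decay rate of the exponential kernel
    """
    intensity = list(mu)
    for t_k, j in events:
        if t_k < t:  # only events strictly before t contribute
            decay = math.exp(-beta * (t - t_k))
            for i in range(len(mu)):
                intensity[i] += alpha[i][j] * decay
    return intensity


# Hypothetical behaviour types: 0 = port scan, 1 = login attempt.
mu = [0.1, 0.05]
alpha = [[0.2, 0.0],
         [0.6, 0.3]]   # scans strongly raise the rate of login attempts
events = [(1.0, 0), (2.0, 0)]
print(hawkes_intensity(2.5, events, mu, alpha, beta=1.0))
```

In this toy setting two recent scans push the expected rate of login attempts well above its baseline, which is the kind of cross-behaviour influence such a model is meant to capture.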
This paper proposes an AI-based solution for rule-based content. It also presents a method for
identifying and evaluating attacks using machine learning techniques such as LightGBM, Random
Forest, and K-NN. The system is implemented on, and the attacks are analysed with, the Cowrie
honeypot. The authors built this AI-based threat identification and detection system to gain a
deeper understanding of attacks made against Cowrie. In future work, the authors suggest
further research into reinforcement learning as a way of improving the system
(BO-XIANG WANG, 2022).
Machine Learning Approaches

This study develops an SD-Honeypot network by placing a honeypot sensor (Suricata) in a
software-defined networking (SDN) environment. By integrating an Intrusion Prevention System
(IPS) application into SDN, it addresses the availability problem in network security, in
particular that caused by Distributed Denial of Service (DDoS) attacks. To identify DDoS
attacks, the methodology runs a single application on a Latest Honeypot Network (LHN) Server
together with several machine learning techniques: Support Vector Machines (SVM), Multi-Layer
Perceptron (MLP), Classification and Regression Trees (CART), Gaussian Naive Bayes (GNB), and
K-Nearest Neighbours (KNN). The contributions are a smart SDN that provides DDoS-IPS
capabilities and the use of the CART algorithm, which enables rapid reaction owing to the
network's simplicity. The authors identify the use of reinforcement learning to improve the
system as future work (Sumadi, 2022).
This study deploys wireless honeypots (WHs) in ultra-dense networks and develops a strategic
deployment approach using Reinforcement Learning (RL) techniques such as Q-Learning and
ε-Greedy. The paper also explores several kinds of honeypot that provide a high level of
security. The method uses RL agents to determine a suitable number of WHs for protecting
access points. The paper's contributions are an investigation of WHs in very dense networks, a
strategic deployment method based on two RL algorithms, and an attempt to find the ideal
number of WHs. The suggested future work is to investigate more sophisticated RL approaches
for using WHs in B5G, 5G Core, and 5G-RAN networks (Radoglou-Grammatikis, 2022).
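The combination of a Q-value update with ε-greedy (epsilon-greedy) action selection can be sketched on a toy version of the "how many honeypots" decision. The example below is a simplification under stated assumptions: there is a single state and each episode lasts one step, so it behaves like a bandit problem, and the reward function (protection that saturates at three honeypots, minus a per-honeypot cost) is invented for illustration rather than taken from the reviewed paper.

```python
import random

ACTIONS = range(6)  # candidate numbers of wireless honeypots: 0..5


def reward(num_honeypots):
    """Hypothetical reward: protection saturates at 3 units, each unit has a cost."""
    protection = 10 * min(num_honeypots, 3)
    cost = 4 * num_honeypots
    return protection - cost


def train(episodes=3000, lr=0.1, epsilon=0.3, seed=0):
    """Learn Q-values for a single-state, single-step deployment decision."""
    rng = random.Random(seed)
    q = [0.0 for _ in ACTIONS]
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(list(ACTIONS))         # explore
        else:
            action = max(ACTIONS, key=lambda a: q[a])  # exploit
        # The episode ends after one step, so the usual bootstrap term
        # gamma * max Q(next state, a') is zero and the target is the reward.
        q[action] += lr * (reward(action) - q[action])
    return q


q_values = train()
best = max(ACTIONS, key=lambda a: q_values[a])
print(best, [round(v, 2) for v in q_values])
```

With ε set above zero the agent keeps sampling every option occasionally, which lets it discover that three honeypots gives the highest reward under this made-up cost model even though it starts with no knowledge of the reward function.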
This study uses a machine learning (ML) augmented honeypot to reduce the labour cost of data
processing, with an emphasis on automatically assessing and cleaning the data to improve
detection. The honeypot architecture consists of four modules: Request Handle, Base, Payload
Detection, and Logger. To recognise and record the contents of requests, the Payload
Detection module uses machine learning methods, specifically a regret value ensemble. The
contributions are the design and implementation of an ML-enhanced honeypot system, the
automation of data assessment using ML, and the use of a regret value ensemble strategy to
increase detection performance. In further work, the authors aim to investigate reinforcement
learning as a way of improving the proposed system (Jiang, 2020).
This study addresses the challenge of detecting social spammers and protecting social media
platforms from social attacks. The proposed technique builds a system based on BLS (Bilateral
Supervision Learning) and SSL (Semi-Supervised Learning). The model is trained on a
real-world Twitter dataset using a small quantity of labelled data and a large quantity of
unlabelled data. For evaluation, the output of the proposed model, named ASSD (Adaptable
Social Spammer Detection), is compared with that of existing supervised and semi-supervised
machine learning models. The contribution is a fully customisable SSD system that includes
both a BLS and an SSL component. Future work should aim to reduce training time while
increasing the accuracy of spammer identification and optimising the computational complexity
of the system (Qiu, 2020).
This paper introduces the Probabilistic Honeypot Game (PHG) into Software-Defined Networking
(SDN). It addresses Distributed Denial of Service (DDoS) attacks in the setting of the
Industrial Internet of Things (IIoT), investigates whether several groups of Bayesian Nash
Equilibria exist in the PHG, and aims to use PHG techniques to manage malicious attacks
effectively. The approach combines PHG strategies with SDN. The emphasis is on the defender's
optimal strategy, and an optimal strategy analysis is carried out using PHG. Anti-honeypot
attacks are also covered, with an emphasis on the benefits they offer attackers and on how PHG
methods can be used to analyse the interactions involved. The contributions include addressing
DDoS attacks in an IIoT scenario, introducing the PHG technique into SDN, and examining how
attackers detect honeypots for the purpose of exploitation. Future research will focus on
improving the precision of PHG and applying it in real-world settings (Wang, 2020).
Honeypot-as-a-Service for Smart Home Solutions

This article concerns collaboration in cybersecurity and the development of partnerships
between the European Union (EU) and the Association of Southeast Asian Nations (ASEAN). Its
goals include increasing cooperation, finding a national solution, raising cyber
preparedness, lowering cyber risk, and strengthening cybersecurity supervision. The approach
has two steps: constructing the YAKSHA architecture and studying its applicability in an
Internet of Things (IoT) setting. Contributions include enhancing EU-ASEAN collaboration,
presenting the YAKSHA framework, documenting its integration into an IoT deployment, and
discussing its use in non-commercial IoT applications. The identified future work is to
further automate and test the YAKSHA system in a variety of IoT situations in order to
increase its efficiency and scalability (Kostopoulos, 2020).
Intrusion Detection Systems
This article examines several attacks on cloud computing security, such as cloud wrapping,
browser malware-injection, and flooding attacks. The strategy is to identify the attack and
its origin, forward the information to a honeypot, and then analyse it in order to thwart the
attacker. The paper's contribution is a comprehensive review of the attacks that can occur in
cloud computing. The identified research gap is the need to investigate machine learning
approaches that automate honeypots for gathering information from attackers, which is
expected to lead to more accurate results (Devi, 2020).
Honeypot and Machine Learning

This study develops a honeypot and a machine-learning-based architecture for detecting
malicious software. It applies Decision Tree and Support Vector Machine (SVM) techniques to
improve accuracy by raising detection rates and lowering the number of false alarms.
Honeypots collect traffic packets, which are stored on a dummy server before being analysed.
A dataset of 900,000 records is used to train the SVM and Decision Tree models. The
contribution is a malware detection system that combines machine learning with honeypots. The
study suggests that future work should investigate unsupervised machine learning to achieve
comparable aims (Matin, 2019).
Honeypot-based approaches and machine learning (ML) are utilised in this malware detection
solution for Internet of Things platforms. The primary goals are to design a system that can
identify Distributed Denial of Service (DDoS) assaults and Zero-Day vulnerabilities and
provide protection against them. Setting up an Internet of Things virtual honeypot to collect
log files and then using machine learning models for DDoS attack detection, such as KNN
and random forest, are the steps involved in the process. The contribution of this work is a
DDoS detection system that uses honeypot and machine learning techniques. Future work will
investigate unsupervised machine learning as a means of improving honeypot capabilities and
will expand the system so that it can identify more kinds of attack
(Vishwakarma, 2019).
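The K-nearest-neighbours step can be illustrated with a small standard-library sketch. It labels a new sample by majority vote among the closest labelled samples. The features (connections per second and mean packet size), the training points, and the labels are hypothetical and are not data from the reviewed study.

```python
import math
from collections import Counter

# Hypothetical labelled samples extracted from honeypot logs:
# (connections per second, mean packet size in bytes) -> label
TRAINING = [
    ((5, 520), "benign"), ((8, 480), "benign"), ((6, 610), "benign"),
    ((950, 64), "ddos"), ((1200, 60), "ddos"), ((870, 72), "ddos"),
]


def knn_predict(sample, training=TRAINING, k=3):
    """Label a sample by majority vote among its k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


print(knn_predict((1000, 66)))  # many small packets per second
print(knn_predict((7, 500)))    # low rate, larger packets
```

The sketch uses raw Euclidean distance for brevity. With real log features on very different scales, the features would normally be normalised first so that no single feature dominates the distance.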
This paper first develops a honeypot system that can detect possible attacks by automatically
scanning network traffic or log files. Second, it presents a new automated identification
model that can differentiate between regular servers and honeypots. The model is based on
three groups of features obtained with a random forest technique. The methodology has three
components: data collection and labelling, feature extraction, and computation of the
extracted features. The contributions include establishing that honeypots are capable of
simulating actual systems and providing a reference point for the further development of
honeypot technology. For future work, the authors propose looking at reinforcement learning as
a possible strategy
(Huang, 2019).
Honeypot for IoT Devices Through Reinforcement Learning

HoneyIoT was developed as an adaptive high-interaction honeypot to address the
vulnerabilities and threats that exist within IoT-based systems. The study's main contribution
lies in the creation of a honeypot system that is capable of gathering attack data, exposing
attacker techniques and tactics, circumventing pre-attack verifications, and successfully
deceiving attackers. The employed methodology entails the construction of a tangible
mechanism for gathering attack traces, the formulation of attack behaviour models through
a Markov decision process, the utilisation of reinforcement learning methods to determine
optimal responses, and the application of differential analysis techniques to produce
responses of high accuracy. Potential future work may entail augmenting the functionalities
of HoneyIoT, investigating its efficacy against advancing attack methodologies, and
expanding its implementation and assessment on a broader scope (Anon., 2023).
This study aims to address the limitations of existing IoT honeypots in generating realistic
network traffic flows, and to evaluate how well a proposed deep learning-based approach to
synthetic IoT traffic generation produces flows that mimic the real network traffic arising
from interactions between users and IoT devices. The study's original contribution
is a novel deep learning-based method for simulating realistic traffic flows in IoT
networks. The research emphasises the significance of simulating genuine network activity in
IoT honeypots for the purposes of cyber deception and strengthening security measures. By
combining domain-specific information shared by IoT devices with a basic generative
adversarial learning method for sequences, the suggested technique is able to overcome the
barrier of a lack of device-specific IoT traffic data. The research in this article employs a deep
learning-based methodology to create artificial Internet of Things (IoT) traffic flows. To
evaluate the efficacy of the suggested method, the research involves a comprehensive
experimental assessment with 18 IoT devices. The study compares the synthetic IoT traffic
generator against state-of-the-art sequence and packet generators and reports that its
traffic can fool even a dynamic attacker. Additional domain-specific information or
the investigation of other cutting-edge machine learning approaches might be included into
the suggested deep learning-based methodology to improve it in future work. The research
might be expanded to examine how well the synthetic IoT traffic creation tool performs in
real-world settings and against more sophisticated adversaries. The suggested method's
scalability and applicability to a broader set of IoT devices and network settings might
potentially benefit from more study (Joseph Bao, 2023).
Honeypot for IoT Protocols based on Android

The aim is to create a honeypot for IoT protocols utilising the advanced HosTaGe honeypot
framework, which is tailored for IoT communication protocols operating on public networks.
The objective of this paper is to examine the insufficiency of security integration in the
development of Internet of Things (IoT) and the susceptibility of IoT systems to cyber-
attacks. The research suggests the utilisation of honeypots, which function as susceptible
components or systems, for the purpose of collecting information on potential assailants
within the decentralised framework of the Internet of Things. The study's methodology entails
the deployment of the MQTT, CoAP, and AMQP protocols on a simulated mobile honeypot running
on an Android device. The Telnet and SSH protocols currently used in IoT systems have been
enhanced to work on the honeypot. The authors have documented and assessed genuine public
attacks on these protocols and established a platform for interacting with the deployed
honeypot. Potential future research endeavours may encompass the
extension of the range of IoT network protocols that are supported, the augmentation of the
honeypot's capacities for the identification and alleviation of attacks, and the incorporation of
the discoveries into IoT security methodologies to heighten the robustness of IoT systems
against cyber-attacks (Irini Lygerou, 2022).
Hybrid IoT/OT Honeypots
This study offers a comprehensive assessment of deception strategies such as honeypots for
Internet of Things (IoT) and operational technology (OT) protocols. It also seeks to overcome
the limitations of earlier honeypots that were susceptible to fingerprinting attacks. The
contribution is the extension and evaluation of RIoTPot, a hybrid-interaction honeypot, in a
three-month longitudinal study in which RIoTPot was exposed to Internet-based attacks and its
performance was assessed against various parameters. The authors have also made the study's
dataset available to other researchers on request. The methodology deploys RIoTPot in three
interaction variants and six protocols, on both cloud-based and self-hosted infrastructure.
The authors gathered and examined a dataset of 10.87 million attack events originating from
22,518 distinct IP addresses. The attacks covered a range of techniques, including
brute-force, poisoning, and multistage attacks. The researchers also fingerprinted the source
IP addresses to identify the devices the attackers used. The findings suggest that the
interaction level of a honeypot is a critical factor in attracting targeted attacks and
scanning attempts. Future research may include further development and optimisation of the
hybrid-interaction honeypot, examination of additional attack vectors, and formulation of
countermeasures based on the insights obtained from
the investigation (Srinivasa, 2022).
Honeypots are a useful method for tracking and identifying cyber-attacks in IoT-enabled
homes. Machine learning algorithms are now among the most promising ways to spot zero-day
threats, and researchers have studied combining them with honeypots in Internet of Things
(IoT) smart homes to counteract zero-day attacks. The studies reviewed above report that these
systems can identify and protect against zero-day threats in IoT smart homes. However, further
study is required to assess their efficacy in practice and to address the difficulty of
integrating them with existing IoT devices.
CHAPTER 3
Methodology
Introduction
This chapter sets out the methodology for constructing a honeypot architecture that can
recognise and neutralise threats in a smart home environment. It provides a step-by-step guide
for establishing the honeypot environment and for applying machine learning techniques to
monitor, identify, and neutralise potential security risks.
Figure 1: Proposed Model
Designing the honeypot infrastructure

The first step in establishing a honeypot is to create the infrastructure that will support it.
This involves identifying the hardware, software, and network components required to create a
realistic smart home environment. The honeypot infrastructure can be implemented with either
physical or virtual devices, depending on the resources available at the time of deployment.
A physical honeypot is a controlled environment in which real devices, such as smart home hubs,
cameras, thermostats, and other Internet of Things (IoT) devices, are installed. These devices
are connected to a network that mimics the default configuration of a typical home network,
including networking equipment such as routers and switches.
Virtual honeypots create virtual representations of connected home appliances and networks
using virtualisation technologies such as hypervisors and containers. Because several instances
can run on a single piece of hardware, this approach offers greater flexibility and scalability.
In our case, we will create a virtual environment, given the resources available.
A honeypot is a cybersecurity mechanism that sets up a trap or decoy system in order to attract
and monitor malicious actors such as hackers or cybercriminals. Honeypots may also be used to
monitor legitimate traffic (Lutkevich, 2021). They are also sometimes referred to as honey
baskets. Honeypots serve two purposes: first, they collect intelligence about an adversary's
plans, techniques, and tools; second, they divert the adversary's attention away from the real
systems being protected. To do so, a honeypot is made to look like a genuine target or a
vulnerable system so that potential attackers are encouraged to interact with it. Depending on
the objectives of the cybersecurity team, it can simulate a wide variety of systems, including
web servers, databases, and network devices. A honeypot can be set up either on the internal
network of an organisation, in which case it is referred to as an internal honeypot, or on a
separate network segment, in which case it is referred to as an external honeypot
(Kumar, 2023).
There are two main types of honeypots: low-interaction and high-interaction honeypots:
 Low-interaction honeypots mimic only a small subset of services or protocols, giving
attackers a more constrained environment in which to interact with the honeypot. They are
easier to deploy and maintain, but the information they give about an attacker's behaviour
is less detailed (Lakhwani, 2022).
 High-interaction honeypots offer a more realistic environment by imitating a broad variety
of services and allowing attackers to engage in prolonged interaction. They can capture
more complex attack strategies and give more in-depth insight into the attacker's
behaviour, but they take more resources to implement and manage (Lakhwani, 2022).
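To make the low-interaction case concrete, the following is a minimal sketch in Python, using only the standard library. It assumes a single fake Telnet-style login prompt; the banner text, port 2323, and the names `handle_session` and `serve` are illustrative choices and are not part of the proposed design. A real deployment would also need the isolation measures discussed in this chapter.

```python
import socket
from datetime import datetime, timezone

BANNER = b"login: "  # hypothetical banner imitating a Telnet-style prompt


def handle_session(conn, peer, log, max_bytes=1024):
    """Send a fake login prompt, record whatever the client sends, then close."""
    conn.settimeout(5.0)
    received = b""
    try:
        conn.sendall(BANNER)
        received = conn.recv(max_bytes)
    except OSError:
        pass  # timeouts and connection resets are expected from scanners
    finally:
        conn.close()
    log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "peer": peer,
        "payload": received.decode("latin-1"),
    })


def serve(host="127.0.0.1", port=2323, log=None):
    """Accept connections one at a time and pass each to handle_session."""
    log = [] if log is None else log
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen()
        while True:
            conn, addr = server.accept()
            handle_session(conn, addr[0], log)
```

Because the sketch emulates only one prompt and never executes attacker input, it is cheap to run but records little beyond the first bytes sent, which illustrates the trade-off described in the list above.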
Honeypots can gather a wide variety of data, such as network traffic, system logs, and the
actions of attackers. By analysing this data, cybersecurity specialists can gain significant
insight into the tactics, strategies, and motives of attackers, as well as possible holes in the
organisation's defences. This knowledge can be used to strengthen security measures, construct
more robust defences, and improve incident response capabilities (Guan, 2023).
However, honeypots also carry risk. If a honeypot is not adequately separated from the rest of
the network, attackers might use it as a springboard to launch attacks on other systems. To
reduce this risk, it is essential to put in place stringent security precautions such as
network segmentation and isolation. With these precautions, honeypots are a useful tool in
cybersecurity because they provide a proactive approach to collecting threat intelligence and
enable organisations to keep one step ahead of possible attackers.
Selecting the Machine Learning Algorithms

This step selects machine learning techniques that can recognise and classify harmful
behaviour. The capacity of machine learning algorithms to learn patterns and recognise
anomalies in data makes them useful for detecting hostile behaviour. Many different machine
learning approaches can be used for this purpose; neural networks, random forests, and decision
trees are popular choices. When deciding which algorithms to use, it is important to consider
factors such as the available processing power, the complexity of the data, and any accuracy
requirements.
Machine Learning:
Machine learning is a subfield of computer science and artificial intelligence that uses data
and algorithms to imitate the way in which humans learn, gradually improving its accuracy
(Education, 2020).
Machine learning is an important component of the rapidly growing discipline of data science.
In data mining, statistical techniques are used to train algorithms to make classifications or
predictions and to uncover key insights. These insights then drive decisions within
applications and businesses, ideally improving key performance indicators (KPIs). As big data
continues to grow, demand for data scientists in the labour market is expected to increase.
They will be needed to help identify the most relevant business questions and the data required
to answer them (Priyadharshini, 2020).
Types of Machine Learning
The two primary subfields of machine learning are supervised learning and unsupervised
learning. Each has its own purpose, carries out its own tasks, produces its own kind of output,
and relies on its own kind of data. Around 70 percent of machine learning is supervised
learning, whereas 10 to 20 percent is unsupervised learning; the remainder is accounted for by
reinforcement learning (Priyadharshini, 2020) (Menon, 2021).
The widespread availability of machine learning (ML) libraries has given the honeypot strategy
a new lease of life. Machine learning techniques have proved helpful both for proactive
functionality and for retrospective analysis, and each of these domains offers a wide range of
applications. Because the algorithm can be "taught" using data that has already been collected,
supervised learning is an excellent choice for retrospective analysis (Alan, 2023). Once the
algorithm has been trained on this data, new events are classified on the basis of their
characteristics. There are many kinds of classifiers, including linear classifiers, Naive Bayes
classifiers, Support Vector Machines (SVM), Decision Trees, and Random Forests. Data can be
structured in different ways using supervised and unsupervised learning; in the unsupervised
case, the algorithm decides for itself how to group data into categories or form links between
entities (Dowling, 2020).
Supervised Learning
Supervised learning uses labelled training data: the correct output for each training example
is already known, so the learning process is guided towards the right answer. The labelled
input data is used to train the machine learning model. Once training is complete, the model
can be given unseen data and will produce a prediction for it (Education, 2020) (IBM, 2020).
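As an illustration of learning from labelled data, the sketch below classifies an unseen sample with a k-nearest-neighbour vote. The choice of k-nearest neighbours, the two features, and the sample values are assumptions made for this example only; they are not the model or dataset selected in this proposal.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, sample, k=3):
    """Label `sample` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, sample), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical labelled flows: (packets per second, distinct ports contacted)
train_x = [(5, 1), (8, 2), (6, 1), (300, 40), (250, 35), (400, 60)]
train_y = ["benign", "benign", "benign", "attack", "attack", "attack"]

print(knn_predict(train_x, train_y, (7, 2)))     # unseen sample close to the benign flows
print(knn_predict(train_x, train_y, (280, 50)))  # unseen sample close to the attack flows
```

The labels in `train_y` play the role of the known outputs: without them the function would have nothing to vote on.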
Unsupervised Learning
Unsupervised learning is a subfield of machine learning that trains on data that has not been
labelled or classified in advance by an expert. The term "unsupervised" refers to the
algorithm's ability to work without labelled examples or prior guidance. The unlabelled data is
fed into a machine learning algorithm in order to train a model. Once trained, the model can
recognise patterns in the data and respond appropriately (Priyadharshini, 2020).
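A small sketch of the unsupervised case follows. It groups unlabelled one-dimensional values into two clusters with the standard k-means (Lloyd) iteration, fixed at k = 2. The function name `two_means` and the request-rate figures are hypothetical and serve only to show that no labels are supplied.

```python
def two_means(values, iterations=20):
    """Split unlabelled 1-D values into two clusters with Lloyd's algorithm."""
    low, high = min(values), max(values)  # initial cluster centres
    near_low, near_high = list(values), []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_high:
            break  # all values identical, so only one cluster exists
        # Update step: move each centre to the mean of its cluster.
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return (low, near_low), (high, near_high)


# Hypothetical requests-per-minute readings with no labels attached
rates = [4, 5, 6, 5, 250, 5, 260, 4]
(normal_centre, normal), (burst_centre, burst) = two_means(rates)
print(burst)
```

The algorithm separates the two bursts from the background traffic on its own; deciding whether the burst cluster is actually malicious still requires further analysis.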
Different ML Models
Support Vector Machine

The Support Vector Machine (SVM) is an algorithm that can be applied to both classification and
regression problems, although the vast majority of its applications are in classification. In
the SVM technique, each data point is treated as a point in n-dimensional space, where n is the
number of features and the value of each feature is the value of a specific coordinate.
Classification is then performed by finding the hyperplane that most clearly separates the two
classes of data (Jon, 2021).
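For reference, the separating hyperplane can be written in the standard textbook form of the hard-margin linear SVM. The symbols are introduced here for illustration and do not appear elsewhere in this proposal: w is the weight vector, b the bias, x_i a training point, and y_i its class label in {-1, +1}.

```latex
% Separating hyperplane and the resulting decision rule
w^\top x + b = 0, \qquad \hat{y} = \operatorname{sign}(w^\top x + b)

% Hard-margin SVM: maximising the margin 2 / \lVert w \rVert
% is equivalent to minimising \lVert w \rVert^2 / 2
\min_{w,\, b} \ \tfrac{1}{2} \lVert w \rVert^2
\quad \text{subject to} \quad
y_i \,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n
```

The "clearest division" between the classes corresponds to the hyperplane with the largest margin, which is what the constraint set enforces.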
Random Forest
Random Forest builds its trees using two kinds of randomness. First, each tree is constructed
from a random sample of the full dataset. Second, at each node a random subset of the features
is selected as candidates for the best split. Previous research has reported that Random Forest
is among the most accurate classification methods currently in use. One reason is that Random
Forest can estimate which features are most important for classification when trained on a
large dataset (IBM, 2022).
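The two sources of randomness can be sketched as follows. The code shows only the sampling and voting steps, not tree construction. Using roughly the square root of the number of features per split is a common heuristic assumed here, and the function names and sample rows are hypothetical.

```python
import math
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """First source of randomness: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, rng):
    """Second source: pick about sqrt(n_features) candidate features for a split."""
    k = max(1, round(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))


def majority_vote(predictions):
    """Combine the per-tree predictions into the forest's final label."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = [[0.1, 5, 1, 0], [0.4, 7, 0, 1], [0.9, 2, 1, 1], [0.3, 8, 0, 0]]
sample = bootstrap_sample(rows, rng)                      # training data for one tree
features = random_feature_subset(n_features=4, rng=rng)   # candidates at one node
print(majority_vote(["attack", "benign", "attack"]))
```

Each tree would be trained on its own `sample`, drawing a fresh `features` subset at every node, and the forest's output is the majority vote across trees.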
Decision Tree
The Decision Tree is one of the most frequently used and most successful methods for
classifying data. A decision tree resembles a flowchart: each internal node represents a test
on a feature, each branch represents the outcome of the test, and each leaf node, also referred
to as a terminal node, holds a class label.
The Decision Tree is a general-purpose predictive modelling tool with applications in a range
of areas. Decision trees are usually constructed by an algorithm that searches for ways to
split a dataset according to preset criteria. In the context of classification and regression
problems, decision trees serve as a non-parametric supervised learning method. The objective is
to learn simple decision rules from the data so that the model can predict the value of the
target variable (IBM, 2022).
When building a tree from the root node, decision trees use entropy and information gain as
their main splitting criteria. The entropy of a sample measures how homogeneous it is: a sample
whose members all belong to one class has an entropy of zero. In the formula below, P_j is the
proportion of the sample that belongs to class j.
$$\text{Entropy} = -\sum_{j} P_j \log_2 P_j$$
Decision trees can also use the Gini index, which quantifies the impurity of a single node,
that is, the likelihood that a randomly chosen sample in the node would be classified
incorrectly.
$$\text{Gini Index} = 1 - \sum_{j} P_j^2$$
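The entropy and Gini index formulas above, together with the information gain of a split, can be computed directly from a list of class labels. The sketch below assumes a binary split into a left and a right child; the function names and the "attack"/"benign" labels are illustrative.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return P_j, the share of samples in each class j."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]


def entropy(labels):
    """Entropy = -sum_j P_j * log2(P_j)."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


def gini_index(labels):
    """Gini index = 1 - sum_j P_j^2."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


parent = ["attack", "attack", "benign", "benign"]
print(entropy(parent))      # evenly mixed node: 1.0
print(gini_index(parent))   # evenly mixed node: 0.5
print(information_gain(parent, parent[:2], parent[2:]))  # perfect split: 1.0
```

A tree-building algorithm would evaluate candidate splits in this way and keep the one with the highest information gain (or the lowest weighted Gini index).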
Data Collection
In this study, data is collected from public sources such as UCI, Kaggle, and IEEE DataPort.
Data collection is the process of methodically gathering information from a variety of sources
for the purposes of analysis and decision-making. It involves compiling quantitative and
qualitative descriptions using methods such as questionnaires, interviews, and computerised
systems. Primary data is acquired directly from the population of interest, whereas secondary
data has previously been compiled by others working in the subject. Primary sources are
generally considered more reliable than secondary ones. Appropriate data collection ensures
that the information available is precise, complete, and reliable, so that trends and patterns
can be identified and informed decisions made.
Training the ML Model

The machine learning models must then be trained. The models are provided with data that has
been labelled to show both benign and malicious behaviour. With the help of this labelled data,
the models learn to differentiate between normal and abnormal behaviour.
Training a machine learning algorithm involves feeding it data, after which the resulting model
automatically adjusts its parameters to achieve the best possible performance. The models are
refined in this manner until they can accurately detect intrusion attempts.
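The idea of a model adjusting its own parameters from labelled data can be illustrated with logistic regression trained by stochastic gradient descent. This is a sketch under stated assumptions: logistic regression, the learning rate, the epoch count, and the two scaled features are all choices made for the example, not the configuration used in this work.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.1, epochs=1000):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the linear score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 (abnormal) if the predicted probability is at least 0.5."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Hypothetical features scaled to [0, 1]: (connection rate, failed-login ratio)
samples = [(0.05, 0.0), (0.10, 0.1), (0.08, 0.0), (0.90, 0.8), (0.95, 0.9), (0.85, 0.7)]
labels = [0, 0, 0, 1, 1, 1]  # 0 = normal, 1 = abnormal
weights, bias = train_logistic(samples, labels)
print(predict(weights, bias, (0.9, 0.9)), predict(weights, bias, (0.05, 0.05)))
```

Each pass over the labelled samples nudges `weights` and `bias` in the direction that reduces the prediction error, which is the parameter adjustment described above.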
Proposed ML Methodology
We adopt a comprehensive approach to this problem. Our method uses supervised learning and a
regression model to predict malicious profiles. Supervised learning is a branch of artificial
intelligence and machine learning that is distinguished by its use of labelled datasets to
train algorithms for classification and prediction. We apply these models following the
strategy described above.
Figure 2: Proposed ML Methodology
To recognise potentially harmful profiles, we will use an automated machine learning model.
Wireshark, a network protocol analyser, can be used to capture data packets from a network,
such as the one that connects computers to the internet, and to inspect the information
contained within those packets. We intend to use a dataset obtained from Kaggle or UCI. For
modelling we will use RapidMiner, a data science platform built for businesses to examine how
their people, expertise, and data combine to influence decisions. RapidMiner offers a wide
range of data mining and machine learning functions, including data loading and transformation
(also known as ETL), data preparation and visualisation, predictive analytics and statistical
modelling, evaluation, and deployment. RapidMiner is written mainly in Java. The dataset is
first imported into RapidMiner from the local repository. It is then cleaned, and any missing
values are removed, before the data is preprocessed. Next, the target column for the training
model is chosen. The data is then divided into two separate groups, the training data and the
test data. Once the model has been trained, it is evaluated.
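The preparation steps listed above (import, remove missing values, choose the target column, split into training and test data) are carried out in RapidMiner in this work. Purely as an illustration of the same sequence, the sketch below reproduces it with the Python standard library. The column names, the 25 percent test share, and the inline CSV extract are hypothetical.

```python
import csv
import io
import random


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def drop_missing(rows):
    """Remove any row that has an empty or absent value."""
    return [r for r in rows if all(v not in (None, "") for v in r.values())]


def split_features_target(rows, target):
    """Separate the target column from the remaining feature columns."""
    features = [{k: v for k, v in r.items() if k != target} for r in rows]
    labels = [r[target] for r in rows]
    return features, labels


def train_test_split(features, labels, test_ratio=0.25, seed=0):
    """Shuffle once, then hold out the last `test_ratio` share for evaluation."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * (1 - test_ratio))
    train_idx, test_idx = indices[:cut], indices[cut:]
    x_train = [features[i] for i in train_idx]
    x_test = [features[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Hypothetical extract; the column names are illustrative only.
CSV_TEXT = """duration,bytes,protocol,label
1.2,300,tcp,benign
0.4,,udp,attack
3.1,1200,tcp,attack
0.9,150,udp,benign
2.2,800,tcp,attack
"""

rows = drop_missing(load_rows(CSV_TEXT))          # the row with no byte count is dropped
features, labels = split_features_target(rows, target="label")
x_train, x_test, y_train, y_test = train_test_split(features, labels)
print(len(x_train), len(x_test))
```

Keeping the test rows out of training is what allows the later evaluation step to measure performance on data the model has not seen.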
New data is then analysed using the trained machine learning models. The models identify
suspicious behaviour by comparing it with the patterns they learned during training. This
analysis is useful for spotting and understanding attacks that have never been seen before.
This chapter has described the honeypot design for a smart home in detail. The method combines
machine learning algorithms, data collection, and in-depth analysis in an effort to improve the
identification and mitigation of potential risks, thereby safeguarding the safety and
dependability of the smart home infrastructure.
References
Abdou, A., 2021. HoneyModels: Machine Learning Honeypots. Special Topics in Military
Communications, p. 6.
Alan, 2023. What is Machine Learning? Definition, Types, Applications, and more. [Online]
Available at: https://www.mygreatlearning.com/blog/what-is-machine-learning/
[Accessed 2023].
Anon., 2023. HoneyIoT: Adaptive High-Interaction Honeypot for IoT Devices Through
Reinforcement Learning. Chongqi Guan, p. 11.
Ariffin, T. A. M. T., 2022. IoT attacks and mitigation plan: A preliminary study with Machine
Learning Algorithms. p. 6.
BO-XIANG WANG, J.-L. C., 2022. An AI-Powered Network Threat Detection System. Issue
May 17, 2022, p. 9.
BO-XIANG WANG, J.-L. C., 2022. An AI-Powered Network Threat Detection System. IEEE,
Issue May 25, 2022, p. 9.
Das, R. R., 2022. Securing IoT devices using Ensemble Machine Learning in Smart Home
Management System. p. 8.
Devi, B. T., 2020. An Appraisal over Intrusion Detection Systems in Cloud Computing Security
Attacks. Innovative Mechanisms for Industry Applications (ICIMIA 2020), p. 6.
Dowling, S., 2020. A new framework for adaptive and agile honeypots. A new framework for
adaptive and agile honeypots, p. 180.
Education, I. C., 2020. What is machine learning. [Online]
Available at: https://www.ibm.com/cloud/learn/machine-learning
Ellouh, M., 2022. IoTZeroJar: Towards a Honeypot Architecture for Detection of Zero-Day
Attacks in IoT. p. 7.
Guan, C. et al., 2023. HoneyIoT: Adaptive high-interaction honeypot for IoT devices through
reinforcement learning. [Online]
Available at: http://arxiv.org/abs/2305.06430
Huang, C., 2019. Automatic Identification of Honeypot Server Using Machine Learning
Techniques. Security and Communication Networks, Volume 9, p. 9.
IBM, 2020. What is Supervised Learning?. [Online]
Available at: https://www.ibm.com/cloud/learn/supervised-learning#:~:text=Supervised%20learning%2C%20also%20known%20as,data%20or%20predict%20outcomes%20accurately.
IBM, 2022. What is a decision tree. [Online]
Available at: https://www.ibm.com/topics/decision-trees
[Accessed 2023].
IBM, 2022. What is random forest?. [Online]
Available at: https://www.ibm.com/topics/random-forest
[Accessed 2023].
Iqbal, Z., 2022. Denial of Service (DoS) Defences against Adversarial Attacks in IoT Smart
Home Networks using Machine Learning Methods. NUST Journal of Engineering Sciences,
Volume Vol. 15, No. 1, p. 8.
Irini Lygerou, S. S. E. V. G. S. &. D. G., 2022. A decentralized honeypot for IoT Protocols based
on Android devices. International Journal of Information Security volume, p. 21.
Jiang, K., 2020. Design and Implementation of A Machine Learning Enhanced Web Honeypot
System. 2020 13th International Congress on Image and Signal Processing, BioMedical
Engineering and Informatics (CISP-BMEI), p. 5.
Jon, Y., 2021. Support Vector Machine (SVM) Algorithm. [Online]
Available at: https://www.javatpoint.com/machine-learning-support-vector-machine-algorithm
[Accessed 2023].
Joseph Bao, M. K. Y. V. C. K., 2023. IoTFlowGenerator: Crafting Synthetic IoT Device Traffic
Flows for Cyber Deception. Cryptography and Security (cs.CR); Machine Learning , p. 13.
Kostopoulos, A., 2020. Realising Honeypot-as-a-Service for Smart Home Solutions. p. 6.
Kumar, G. et al., 2023. Malware Identification using a set of Transparent Honeypot in
Cyberspace. 2023 3rd International Conference on Innovative Practices in Technology and
Management, p. 7.
Lakhwani, S., 2022. What is a honeypot? Types, benefits, risks and best practices. [Online]
Available at: https://www.knowledgehut.com/blog/security/honeypot
[Accessed 2023].
Lutkevich, B. C. C. a. C. M., 2021. What is a honeypot?. [Online]
Available at: https://www.techtarget.com/searchsecurity/definition/honey-pot
[Accessed 2021].
Matin, I. M. M., 2019. Malware Detection Using Honeypot and Machine Learning. p. 4.
Menon, K., 2021. An introduction to the types of machine learning. [Online]
Available at: https://www.simplilearn.com/tutorials/machine-learning-tutorial/types-of-machine-learning
[Accessed 2023].
Mfogo, V. S., 2023. AIIPot: Adaptive Intelligent-Interaction Honeypot for IoT Devices. p. 7.
Priyadharshini, 2020. What is Machine Learning and types of Machine Learning. [Online]
Available at: https://www.simplilearn.com/tutorials/machine-learning-tutorial/what-is-machine-learning#what_are_the_different_types_of_machine_learning
Qiu, T., 2020. An Adaptive Social Spammer Detection Model with Semi-supervised Broad
Learning. p. 14.
Radoglou-Grammatikis, P., 2022. Strategic Honeypot Deployment in Ultra-Dense Beyond 5G
Networks: A Reinforcement Learning Approach. IEEE, p. 12.
Ray, S., 2017. analyticsvidhya.com. [Online]
Available at: https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/
[Accessed Sunday October 2022].
Sankaran, K. S., 2023. Deep learning-based energy efficient optimal RMC-CNN model for
secured data transmission and anomaly detection in industrial IOT. Sustainable Energy
Technologies and Assessments , Issue 4 January 2023, p. 8.
Srinivasa, S., 2022. Interaction matters: a comprehensive analysis and a dataset of hybrid IoT/OT
honeypots. p. 14.
Sumadi, F. D. S., 2022. SD-Honeypot Integration for Mitigating DDoS Attack Using Machine
Learning Approaches. INTERNATIONAL JOURNAL ON INFORMATICS VISUALIZATION,
Issue March 2022, p. 6.
Tsochev, G., 2021. Using Machine Learning Reacted with Honeypot Systems for Securing
Network. International Conference AUTOMATICS AND INFORMATICS, Issue October 02,
2021, p. 4.
Vishwakarma, R., 2019. A Honeypot with Machine Learning based Detection Framework for
defending IoT based Botnet DDoS Attacks. Third International Conference on Trends in
Electronics and Informatics (ICOEI 2019), p. 6.
Wang, M. D. a. K., 2020. An SDN-Enabled Pseudo-Honeypot Strategy for Distributed Denial of
Service Attacks in Industrial Internet of Things. IEEE TRANSACTIONS ON INDUSTRIAL
INFORMATICS, 16, NO(1, JANUARY 2020), p. 10.
Zhang, J., 2021. AntiConcealer: Reliable Detection of Adversary Concealed Behaviors in
EdgeAI Assisted IoT. p. 10.
Abdou, A., 2021. HoneyModels: Machine Learning Honeypots. Special Topics in Military
Communications, p. 6.
Ariffin, T. A. M. T., 2022. IoT attacks and mitigation plan: A preliminary study with Machine
Learning Algorithms. p. 6.
28
BO-XIANG WANG, J.-L. C., 2022. An AI-Powered Network Threat Detection System. IEEE,
Issue May 25, 2022, p. 9.
Das, R. R., 2022. Securing IoT devices using Ensemble Machine Learning in Smart Home
Management System. p. 8.
Devi, B. T., 2020. An Appraisal over Intrusion Detection Systems in Cloud Computing Security
Attacks. Innovative Mechanisms for Industry Applications (ICIMIA 2020), p. 6.
Ellouh, M., 2022. IoTZeroJar: Towards a Honeypot Architecture for Detection of Zero-Day
Attacks in IoT. p. 7.
Huang, C., 2019. Automatic Identification of Honeypot Server Using Machine Learning
Techniques. Security and Communication Networks, Volume 9, p. 9.
Iqbal, Z., 2022. Denial of Service (DoS) Defences against Adversarial Attacks in IoT Smart
Home Networks using Machine Learning Methods. NUST Journal of Engineering Sciences,
Volume Vol. 15, No. 1, p. 8.
Jiang, K., 2020. Design and Implementation of A Machine Learning Enhanced Web Honeypot
System. 2020 13th International Congress on Image and Signal Processing, BioMedical
Engineering and Informatics (CISP-BMEI), p. 5.
Kostopoulos, A., 2020. Realising Honeypot-as-a-Service for Smart Home Solutions. p. 6.
Matin, I. M. M., 2019. Malware Detection Using Honeypot and Machine Learning. p. 4.
Mfogo, V. S., 2023. AIIPot: Adaptive Intelligent-Interaction Honeypot for IoT Devices. p. 7.
Qiu, T., 2020. An Adaptive Social Spammer Detection Model with Semi-supervised Broad
Learning. p. 14.
Radoglou-Grammatikis, P., 2022. Strategic Honeypot Deployment in Ultra-Dense Beyond 5G

Networks: A Reinforcement Learning Approach. IEEE, p. 12.
Sankaran, K. S., 2023. Deep learning-based energy efficient optimal RMC-CNN model for
secured data transmission and anomaly detection in industrial IOT. Sustainable Energy
Technologies and Assessments , Issue 4 January 2023, p. 8.
29
Sumadi, F. D. S., 2022. SD-Honeypot Integration for Mitigating DDoS Attack Using Machine
Learning Approaches. INTERNATIONAL JOURNAL ON INFORMATICS VISUALIZATION,
Issue March 2022, p. 6.
Tsochev, G., 2021. Using Machine Learning Reacted with Honeypot Systems for Securing
Network. International Conference AUTOMATICS AND INFORMATICS, Issue October 02,
2021, p. 4.
Vishwakarma, R., 2019. A Honeypot with Machine Learning based Detection Framework for
defending IoT based Botnet DDoS Attacks. Third International Conference on Trends in
Electronics and Informatics (ICOEI 2019), p. 6.
Wang, M. D. a. K., 2020. An SDN-Enabled Pseudo-Honeypot Strategy for Distributed Denial of

Service Attacks in Industrial Internet of Things. IEEE TRANSACTIONS ON INDUSTRIAL
INFORMATICS, 16, NO(1, JANUARY 2020), p. 10.
Zhang, J., 2021. AntiConcealer: Reliable Detection of Adversary Concealed Behaviors in

EdgeAI Assisted IoT. p. 10.
Vishwakarma, R. and Jain, A. K. (2019) “A Honeypot with Machine Learning based Detection
Framework for defending IoT based Botnet DDoS Attacks,” in 2019 3rd International
Conference on Trends in Electronics and Informatics (ICOEI). IEEE, pp. 1019–1024
AlMahmeed, Y. S. and Al-Omay, A. Y. (2022) “Zero-day attack solutions using threat hunting
intelligence: Extensive survey,” in 2022 International Conference on Data Analytics for Business
and Industry (ICDABI). IEEE, pp. 309–314.
Shahid, W. B. et al. (2022) “A deep learning assisted personalized deception system for
countering web application attacks,” Journal of information security and applications,
67(103169), p. 103169. doi: 10.1016/j.jisa.2022.103169
Lee, S. et al. (2021) “Classification of botnet attacks in IoT smart factory using honeypot
combined with machine learning,” PeerJ. Computer science, 7(e350), p. e350. doi:
10.7717/peerj-cs.350.
Ahmad, R. and Alsmadi, I. (2021) “Machine learning approaches to IoT security: A systematic
literature review,” Internet of Things, 14(100365), p. 100365. doi: 10.1016/j.iot.2021.100365.
Hamza, A. A. et al. (2022) “HSAS-MD analyzer: A hybrid security analysis system using model-
checking technique and deep learning for malware detection in IoT apps,” Sensors (Basel,
Switzerland), 22(3), p. 1079. doi: 10.3390/s22031079.
Gyamfi, E. and Jurcut, A. (2022) “Intrusion detection in Internet of Things systems: A review on
design approaches leveraging multi-access edge computing, machine learning, and
datasets,” Sensors (Basel, Switzerland), 22(10), p. 3744. doi: 10.3390/s22103744.
Sharma, S., Lone, F. R. and Lone, M. R. (2020) “Machine learning for enhancement of security
in internet of things based applications,” in Security and Privacy in the Internet of Things. 1st
Edition. Chapman and Hall/CRC, pp. 95–108.
Jha, C. K., Biswas, S. S. and Nafis, M. T. (2023) “A comprehensive system for smart homes with
a minimalist information security framework,” in Information and Communication Technology
for Competitive Strategies (ICTCS 2021). Singapore: Springer Nature Singapore, pp. 401–411.
Scott, E. et al. (2022) “Optimising user security recommendations for AI-powered smart-homes,”
in 2022 IEEE Conference on Dependable and Secure Computing (DSC). IEEE, pp. 1–8.
Ali, S. S. and Choi, B. J. (2020) “State-of-the-art artificial intelligence techniques for distributed
smart grids: A review,” Electronics, 9(6), p. 1030. doi: 10.3390/electronics9061030.
Amraoui, N. and Zouari, B. (2022) “Securing the operation of Smart Home Systems: a literature
review,” Journal of reliable intelligent environments, 8(1), pp. 67–74. doi: 10.1007/s40860-021-
00160-3.
Viegas, E. K. et al. (2023) “A dynamic machine learning scheme for reliable network-based
intrusion detection,” in Advanced Information Networking and Applications. Cham: Springer
International Publishing, pp. 439–451.
Kavitha, A. and Priyanka, R. (2022) “Analysis of novel face recognition system to minimize the
false identification rate using fast Fourier transform in comparison with wavelet transform,”
in 2022 14th International Conference on Mathematics, Actuarial Science, Computer Science
and Statistics (MACS). IEEE, pp. 1–5.
El Kamel, N. et al. (2020) “A smart agent design for cyber security based on honeypot and
machine learning,” Security and communication networks, 2020, pp. 1–9. doi:
10.1155/2020/8865474.
Koroniotis, N., Moustafa, N. and Sitnikova, E. (2019) “Forensics and deep learning mechanisms
for botnets in internet of things: A survey of challenges and solutions,” IEEE access: practical
innovations, open solutions, 7, pp. 61764–61785. doi: 10.1109/access.2019.2916717.
Joseph, T. A. and Jayapandian, N. (2022) “Detection of various security threats in IoT and cloud
computing using machine learning,” in 2022 International Conference on Sustainable Computing
and Data Communication Systems (ICSCDS). IEEE, pp. 996–1001.
Meera, A. J., Kantipudi, M. V. V. P. and Aluvalu, R. (2021) “Intrusion detection system for the
IoT: A comprehensive review,” in Advances in Intelligent Systems and Computing. Cham:
Springer International Publishing, pp. 235–243.
Saad, R. M. A., Soufy, K. A. M. A. and Shaheen, S. I. (2023) “Security in smart home
environment: issues, challenges, and countermeasures - a survey,” International journal of
security and networks, 18(1), p. 1. doi: 10.1504/ijsn.2023.129887.
Dowling, S., Schukat, M. and Barrett, E. (2019) “Using reinforcement learning to conceal
honeypot functionality,” in Machine Learning and Knowledge Discovery in Databases. Cham:
Springer International Publishing, pp. 341–355.

Implementing a honeypot for IOT Smart Homes to cope with zero-day

attacks using machine learning

The scope of this research is:

Research Exclusion Criteria

IoT Smart Home Networks using Machine Learning Methods

Honeypot Architecture for Detection of Zero-Day Attacks in IoT

Machine Learning Approaches

Honeypot-as-a-Service for Smart Home Solutions

Honeypot and Machine Learning

Honeypot for IoT Devices Through Reinforcement Learning

To evaluate the effectiveness of a proposed deep learning-based approach to synthetic IoT

Honeypot for IoT Protocols based on Android

Figure 1: Proposed Model

Designing the honeypot infrastructure

Selecting the Machine Learning Algorithms

Types of Machine Learning

Support Vector Machine

Training the ML Model

Figure 2: Proposed ML Methodology

Education, I. C., 2020. What is machine learning. [Online]

IBM, 2020. What is Supervised Learning?. [Online]

IBM, 2022. What is a decision tree. [Online]

IBM, 2022. What is random forest?. [Online]

Jon, Y., 2021. Support Vector Machine (SVM) Algorithm. [Online]

Kostopoulos, A., 2020. Realising Honeypot-as-a-Service for Smart Home Solutions. p. 6.

Kumar, G. e. a., 2023. Malware Identification using a set of Transparent Honeypot in

Lutkevich, B. C. C. a. C. M., 2021. What is a honeypot?. [Online]

Menon, K., 2021. An introduction to the types of machine learning,. [Online]

Radoglou-Grammatikis, P., 2022. Strategic Honeypot Deployment in Ultra-Dense Beyond 5G

Ray, S., 2017. analyticsvidhya.com. [Online]

Wang, M. D. a. K., 2020. An SDN-Enabled Pseudo-Honeypot Strategy for Distributed Denial of

Zhang, J., 2021. AntiConcealer: Reliable Detection of Adversary Concealed Behaviors in

Kostopoulos, A., 2020. Realising Honeypot-as-a-Service for Smart Home Solutions. p. 6.

Radoglou-Grammatikis, P., 2022. Strategic Honeypot Deployment in Ultra-Dense Beyond 5G

Wang, M. D. a. K., 2020. An SDN-Enabled Pseudo-Honeypot Strategy for Distributed Denial of

Zhang, J., 2021. AntiConcealer: Reliable Detection of Adversary Concealed Behaviors in

Saad, R. M. A., Soufy, K. A. M. A. and Shaheen, S. I. (2023) “Security in smart home

You might also like