
A

Major Project
On

Data Poisoning Attacks on Federated Machine Learning


(Submitted in partial fulfilment of the requirements for the award of the Degree)

BACHELOR OF TECHNOLOGY
In
INFORMATION TECHNOLOGY
BY
S.Devesh Kumar (Regd.No:207R1A1251)
B.Vishal Adithya (Regd.No:217R5A1204)
G.Surakshitha (Regd.No:207R1A1217)

T.Vamshi Krishna (Regd.No:207R1A1255)

Under the Guidance of


K.Srinu
(Assistant Professor)

DEPARTMENT OF INFORMATION TECHNOLOGY


CMR TECHNICAL CAMPUS
UGC AUTONOMOUS
(Accredited by NAAC, NBA, Permanently Affiliated to JNTUH, Approved by AICTE, New
Delhi) Recognized Under Section 2(f) & 12(B) of the UGC Act, 1956,
Kandlakoya(V), Medchal Road, Hyderabad-501401.
2020-2024

DEPARTMENT OF INFORMATION TECHNOLOGY

CERTIFICATE
This is to certify that the project entitled “Data Poisoning Attacks on Federated
Machine Learning" being submitted by S. Devesh Kumar(207R1A1251), B. Vishal Adithya
(217R5A1204), G. Surakshitha (207R1A1217), T. Vamshi Krishna (207R1A1255) in partial
fulfilment of the requirements for the award of the degree of B. Tech in Information Technology
of the Jawaharlal Nehru Technological University Hyderabad, is a record of the bonafide work
carried out by them under our guidance and supervision during the year 2023-2024.

The results embodied in this thesis have not been submitted to any other university
or institute for the award of any degree or diploma.

Mr. K. Srinu
(Assistant Professor)
INTERNAL GUIDE

Dr. A. Raji Reddy
DIRECTOR

Dr. B. Kavitha Rani
HEAD OF THE DEPARTMENT

EXTERNAL EXAMINER

Submitted for viva voce Examination held on


ACKNOWLEDGEMENT

Apart from the efforts of us, the success of any project depends largely on the
encouragement and guidelines of many others. We take this opportunity to express our
gratitude to the people who have been instrumental in the successful completion of this
project.
We take this opportunity to express our profound gratitude and deep regard to our
guide Mr. K. Srinu, Assistant Professor, for his exemplary guidance, monitoring, and
constant encouragement throughout the project work.
We also take this opportunity to express a deep sense of gratitude to the Project
Review Committee (PRC) Coordinator, Mr. MD. Sajid Pasha, Associate Professor,
for his cordial support and valuable information, which helped us in completing this
task through various stages.
We are also thankful to the Head of the Department Dr. B. Kavitha Rani for
providing excellent infrastructure and a nice atmosphere for completing this project
successfully.
We would like to express our sincere gratitude to Dr. M. Ahmed Ali Baig, Dean
Administration, Dr. DTV. Dharmajee Rao, Dean Academics, Dr. Ashutosh Saxena,
Dean of R&D for encouragement throughout the course of this presentation.
We are obliged to our Director Dr. A. Raji Reddy for being cooperative
throughout the course of this project.
We would like to express our sincere gratitude to our Management of CMR
Technical Campus, Hyderabad, Sri. Ch. Gopal Reddy, Honourable Chairman, Smt
C. Vasantha Latha, Honourable Secretary, Sri. C. Abhinav Reddy, Honourable Chief
Executive Officer.
The guidance and support received from all the members of CMR TECHNICAL
CAMPUS who contributed and who are contributing to this project was vital for the
success of the project. We are grateful for their constant support and help.
Finally, we would like to take this opportunity to thank our family for their
constant encouragement without which this assignment would not be possible.

S. Devesh Kumar (207R1A1251)


B. Vishal Adithya (217R5A1204)
G. Surakshitha (207R1A1217)
T.Vamshi Krishna(207R1A1255)

ABSTRACT

Federated machine learning, which enables resource-constrained node devices (e.g.,
mobile phones and IoT devices) to learn a shared model while keeping the training data
local, can provide privacy, security, and economic benefits through the design of an
effective communication protocol. However, the communication protocol among
different nodes can be exploited by attackers to launch data poisoning attacks, which
have been shown to be a serious threat to most machine learning models.
In this paper, we explore the vulnerability of federated machine learning. More
specifically, we focus on attacking a federated multi-task learning framework, which
adopts a general multi-task learning formulation to handle the statistical challenges of
federated learning. We formulate the problem of computing optimal poisoning attacks
on federated multi-task learning as a bilevel program that is adaptive to an arbitrary
choice of target nodes and source attacking nodes.
We then propose a novel systems-aware optimization method, ATTack on Federated
Learning (AT2FL), which efficiently derives the implicit gradients for poisoned data
and uses them to compute optimal attack strategies against federated machine learning.
Our work is an early study of data poisoning attacks on federated learning. Finally,
experimental results on real-world datasets show that the federated multi-task learning
model is very sensitive to poisoning attacks when attackers either directly poison the
target nodes or indirectly poison related nodes by exploiting the communication
protocol.


LIST OF FIGURES

Figure 5.1 Proposed Architecture

Figure 5.2 Use Case Diagram

Figure 5.3 Class Diagram

Figure 5.4 Sequence Diagram

Figure 5.5 Collaboration Diagram

Figure 5.6 Activity Diagram

LIST OF SCREENSHOTS

Figure 7.1 Starting the Server

Figure 7.2 Login & Registration Page

Figure 7.3 Accuracy Scores

Figure 7.4 Visualization of Output

INDEX

ABSTRACT

LIST OF FIGURES

LIST OF SCREENSHOTS

1. INTRODUCTION
   1.1 INTRODUCTION
   1.2 PROBLEM STATEMENT
   1.3 OBJECTIVES
   1.4 LIMITATIONS

2. SYSTEM ANALYSIS
   2.1 INTRODUCTION
   2.2 EXISTING SYSTEM
   2.3 DISADVANTAGES OF EXISTING SYSTEM
   2.4 PROPOSED SYSTEM
   2.5 ADVANTAGES OF PROPOSED SYSTEM

3. SYSTEM STUDY
   3.1 FEASIBILITY STUDY
      3.1.1 ECONOMICAL FEASIBILITY
      3.1.2 TECHNICAL FEASIBILITY
      3.1.3 SOCIAL FEASIBILITY

4. SYSTEM REQUIREMENTS
   4.1 SOFTWARE REQUIREMENTS
   4.2 HARDWARE REQUIREMENTS

5. SYSTEM DESIGN
   5.1 INTRODUCTION
   5.2 ARCHITECTURE
   5.3 UNIFIED MODELING LANGUAGE
      5.3.1 USE CASE DIAGRAM
      5.3.2 CLASS DIAGRAM
      5.3.3 SEQUENCE DIAGRAM
      5.3.4 COLLABORATION DIAGRAM
      5.3.5 ACTIVITY DIAGRAM

6. IMPLEMENTATION
   6.1 MODULES
   6.2 MODULES DESCRIPTION
   6.3 SOURCE CODE

7. SCREENSHOTS

8. TESTING
   8.1 TYPES OF TESTS
   8.2 TEST STRATEGY AND APPROACH

9. CONCLUSION & FUTURE SCOPE
   9.1 PROJECT CONCLUSION
   9.2 FUTURE SCOPE

10. BIBLIOGRAPHY
   10.1 REFERENCES

1.INTRODUCTION


1.1 INTRODUCTION
Machine learning has been widely applied to a broad array of applications,
e.g., spam filtering and natural gas price prediction. Across these applications, the reliability
and security of the machine learning system in the presence of adversaries has been a great
concern. For example, to build a product recommendation system, researchers can rely either
on public crowdsourcing platforms, e.g., Amazon Mechanical Turk or Taobao, or on private
teams to collect training datasets. However, both of these methods give attackers the
opportunity to inject corrupted or poisoned data. To improve the robustness of real-world
machine learning systems, it is critical to study how well machine learning performs under
poisoning attacks. Attack strategies on machine learning methods can be divided into two
categories: causative attacks, which influence learning by exerting control over the training
data, and exploratory attacks, which exploit misclassifications without affecting training.
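To make the causative category concrete, the following toy sketch (our own illustration, not code from this project; all names and numbers are assumptions) trains a simple least-squares classifier twice, once on clean data and once after an attacker injects mislabelled points into the training set:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated Gaussian clusters: a clean binary problem.
n = 200
X = np.vstack([rng.normal(-2, 1, (n, 2)), rng.normal(2, 1, (n, 2))])
y = np.hstack([-np.ones(n), np.ones(n)])

def fit(X, y):
    # Ridge-regularized least-squares classifier with a bias column.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.linalg.solve(Xb.T @ Xb + 1e-3 * np.eye(3), Xb.T @ y)

def accuracy(w, X, y):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return float(np.mean(np.sign(Xb @ w) == y))

w_clean = fit(X, y)

# Causative attack: inject 300 points inside the positive cluster but
# labelled as the negative class, then retrain on the corrupted set.
X_poison = rng.normal(2, 0.5, (300, 2))
w_poisoned = fit(np.vstack([X, X_poison]),
                 np.hstack([y, -np.ones(300)]))

acc_clean = accuracy(w_clean, X, y)
acc_poisoned = accuracy(w_poisoned, X, y)
print(f"model trained on clean data:    {acc_clean:.2f}")
print(f"model trained on poisoned data: {acc_poisoned:.2f}")
```

The poisoned model's accuracy on the clean data collapses, even though the attacker only touched the training set, which is exactly what makes causative attacks dangerous.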

1.2 PROBLEM STATEMENT


The project investigates data poisoning attacks within Federated Machine
Learning (FML) systems, where multiple parties collaboratively train a global
model while keeping their data decentralized. The primary objectives include
understanding the forms of data poisoning attack specific to FML environments,
developing effective detection and mitigation strategies to safeguard against these
attacks, evaluating the impact of such attacks on model performance metrics, and
ensuring the scalability and efficiency of the proposed defence mechanisms. By
addressing these challenges, the project aims to enhance the security and reliability
of FML systems, enabling their widespread adoption across diverse domains while
preserving the privacy and integrity of participants' data.


1.3 OBJECTIVES
Objectives for the project are,
• Classify data poisoning attacks in federated machine learning, distinguishing
between conventional and FML-specific strategies.
• Develop detection mechanisms to identify malicious data injected during
decentralized training in FML systems, utilizing anomaly detection and
adversarial example analysis.
• Implement mitigation strategies to counteract data poisoning, including resilient
aggregation algorithms and cryptographic protocols, ensuring model integrity.
• Evaluate the impact of data poisoning on FML model performance, considering
accuracy, convergence rate, and susceptibility to adversarial manipulation, while
ensuring scalability and efficiency of proposed defenses.
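As a minimal illustration of the anomaly-detection objective above (a sketch under our own assumptions, not this project's implementation), the server can flag client updates whose norms deviate wildly from the median, a crude but robust screen for scaled-up malicious updates:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated client updates: 9 honest clients send similar gradient vectors,
# 1 malicious client sends a heavily scaled-up update (a common poisoning
# signature in federated learning).
honest = [rng.normal(0.5, 0.1, 10) for _ in range(9)]
malicious = rng.normal(0.5, 0.1, 10) * 25.0
updates = honest + [malicious]

def filter_by_norm(updates, z_thresh=6.0):
    """Flag updates whose L2 norm deviates from the median norm by more
    than z_thresh median-absolute-deviations (a simple robust test)."""
    norms = np.array([np.linalg.norm(u) for u in updates])
    med = np.median(norms)
    mad = np.median(np.abs(norms - med)) + 1e-12
    keep = np.abs(norms - med) / mad <= z_thresh
    return [u for u, k in zip(updates, keep) if k], keep

kept, mask = filter_by_norm(updates)
print("clients kept after screening:", np.flatnonzero(mask).tolist())
```

Real detectors would also look at update direction and history, but even this norm screen removes the obvious outlier while keeping the honest clients.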

1.4 LIMITATIONS
Limitations for the project are,

• Data Availability: The effectiveness of proposed detection and mitigation strategies


may be limited by the availability and diversity of training data from decentralized
participants, which could impact the robustness of the defense mechanisms.

• Adversarial Knowledge: The project's effectiveness in countering sophisticated data


poisoning attacks may be constrained by the level of adversarial knowledge possessed
by attackers, as they may adapt their strategies to evade detection and mitigation
techniques.

• Computational Overhead: The implementation of defense mechanisms such as


resilient aggregation algorithms and cryptographic protocols may introduce
computational overhead, potentially impacting the scalability and efficiency of
federated machine learning systems, particularly in resource-constrained environments.

• Generalization: The findings and recommendations generated from the project may be
specific to certain types of data poisoning attacks or FML configurations, limiting their
generalizability across diverse domains and scenarios.


2. SYSTEM ANALYSIS


2. SYSTEM ANALYSIS

2.1 INTRODUCTION

Federated Machine Learning (FML) enables collaborative model training
across decentralized environments but faces security risks like data poisoning attacks.
These attacks aim to compromise model integrity by injecting malicious data during
training. Understanding and mitigating data poisoning in FML systems is crucial for
ensuring model reliability. This project investigates data poisoning attacks within
FML, focusing on identifying attack vectors, developing detection and mitigation
strategies, and evaluating their efficacy. The goal is to enhance the security of FML
systems, enabling their adoption across diverse domains while safeguarding data
privacy and integrity.

2.2 EXISTING SYSTEM


Data poisoning attacks have become an urgent research topic in adversarial machine
learning, where the target is the machine learning algorithm itself. An early attempt
investigated poisoning attacks on support vector machines (SVMs), using a gradient ascent
strategy in which the gradient is obtained from properties of the SVM's optimal solution.
The vulnerability of multi-task learning has also been studied. However, the motivation of
that work differs significantly from ours: there, the data samples are pooled together, which
is different from the federated machine learning scenario, where models are built on datasets
distributed across multiple nodes/devices while preventing data leakage.
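The gradient-ascent idea carries over to any differentiable learner. The sketch below (our own toy, not the SVM attack itself; the data and constants are assumptions) poisons a ridge regression model: the attacker adjusts one poison point's feature by gradient ascent on the victim's clean-data loss, with the inner training problem solved in closed form:

```python
import numpy as np

rng = np.random.default_rng(2)

# Clean 1-D regression data: y = 2x + noise, fitted by ridge regression.
X = rng.uniform(-1, 1, (50, 1))
y = 2 * X[:, 0] + rng.normal(0, 0.1, 50)
lam = 0.1

def ridge_fit(X_, y_):
    # Inner problem, closed form: w = (X^T X + lam*I)^-1 X^T y
    return np.linalg.solve(X_.T @ X_ + lam * np.eye(X_.shape[1]), X_.T @ y_)

def victim_loss(xp, yp):
    # Outer (attacker's) objective: clean-data MSE of the model trained
    # on the clean data plus the single poison point (xp, yp).
    Xp = np.vstack([X, [[xp]]])
    w = ridge_fit(Xp, np.append(y, yp))
    return float(np.mean((X @ w - y) ** 2))

# Gradient ascent on the poison feature, via central finite differences.
xp, yp, step, eps = 0.5, -3.0, 5.0, 1e-5
for _ in range(50):
    g = (victim_loss(xp + eps, yp) - victim_loss(xp - eps, yp)) / (2 * eps)
    xp = float(np.clip(xp + step * g, -1, 1))  # keep the feature feasible

print(f"optimized poison point: ({xp:.2f}, {yp})")
print(f"victim's clean-data MSE after poisoning: {victim_loss(xp, yp):.4f}")
```

The bilevel structure in this report generalizes exactly this pattern: an outer attacker objective wrapped around an inner training problem.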

2.3 DISADVANTAGES OF EXISTING SYSTEM

The main disadvantages of the existing system are:

• No specific attack model: The existing system has no data poisoning attack model for
federated machine learning.
• No data integrity check: There is no data integrity check to guard against data
poisoning attacks.


2.4 PROPOSED SYSTEM

The system proposes a bilevel optimization framework to compute optimal poisoning
attacks on federated machine learning. To the best of our knowledge, this is an early
attempt to explore the vulnerability of federated machine learning from the
perspective of data poisoning.
The proposed system derives an effective optimization method, i.e., ATTack on
Federated Learning (AT2FL), to solve the optimal attack problem, which can address
the systems challenges associated with federated machine learning.
The proposed system demonstrates the empirical performance of our optimal attack
strategy and the proposed AT2FL algorithm on several real-world datasets. The
experimental results indicate that the communication protocol among multiple nodes
opens a door for attackers to attack federated machine learning.

2.5 ADVANTAGES OF PROPOSED SYSTEM


Advantages of the Proposed System are,

• Defence Mechanisms:
o Integrates defence strategies specifically designed to address the decentralized
nature and unique challenges of federated machine learning environments.
• Enhanced Adaptability:
o Adaptable to the dynamic nature of federated learning settings, ensuring robust
protection against evolving data poisoning attacks.
• Comprehensive Coverage:
o Provides comprehensive coverage across various machine learning models
commonly employed in federated settings, ensuring a holistic approach to
defending against attacks.
• Scalability and Practical Applicability:
o Designed with scalability in mind, enabling practical deployment in real-world
scenarios with large-scale and evolving datasets while maintaining effectiveness
against sophisticated attacks.
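As one concrete instance of such a resilient aggregation defence (our own sketch; the report does not commit to a specific aggregation rule, and the numbers here are made up), replacing the server's mean aggregation with a coordinate-wise median bounds the influence of a minority of colluding attackers:

```python
import numpy as np

rng = np.random.default_rng(3)

true_grad = np.ones(5)
# 8 honest clients report the true gradient plus small noise; 2 colluding
# attackers report a large opposite-signed update.
updates = np.array([true_grad + rng.normal(0, 0.05, 5) for _ in range(8)]
                   + [-10 * true_grad, -10 * true_grad])

mean_agg = updates.mean(axis=0)          # plain averaging: easily skewed
median_agg = np.median(updates, axis=0)  # coordinate-wise median: robust

print("mean aggregate:  ", np.round(mean_agg, 2))
print("median aggregate:", np.round(median_agg, 2))
```

The mean is dragged to the wrong sign by just two attackers, while the median stays close to the honest gradient, which is why median-style rules are a standard building block for robust federated aggregation.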


3.SYSTEM STUDY


3. SYSTEM STUDY

3.1 FEASIBILITY STUDY


The feasibility of the project is analyzed in this phase, and a business proposal is put forth
with a very general plan for the project and some cost estimates. During system analysis,
the feasibility study of the proposed system is carried out, to ensure that the proposed
system is not a burden to the company. For feasibility analysis, some understanding of the
major requirements for the system is essential.

Three key considerations involved in the feasibility analysis are:

3.1.1 ECONOMICAL FEASIBILITY

This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the
research and development of the system is limited, so the expenditures must be
justified. The developed system was well within the budget, which was achieved
because most of the technologies used are freely available; only the customized
products had to be purchased.

3.1.2 TECHNICAL FEASIBILITY


This study is carried out to check the technical feasibility, that is, the
technical requirements of the system. Any system developed must not place a high
demand on the available technical resources, as this would lead to high demands
being placed on the client. The developed system must have modest requirements,
as only minimal or no changes are required for implementing this system.


3.1.3 SOCIAL FEASIBILITY

This aspect of the study checks the level of acceptance of the system by
the user. This includes the process of training the user to use the system efficiently.
The user must not feel threatened by the system but must accept it as a necessity.
The level of acceptance by the users depends on the methods employed to educate
users about the system and to make them familiar with it. Their confidence must be
raised so that they can also offer constructive criticism, which is welcomed, as they
are the final users of the system.


4.SYSTEM REQUIREMENTS


4.SYSTEM REQUIREMENTS

4.1 SOFTWARE REQUIREMENTS


Operating system     : Windows 10 or above
Programming language : Python 3.6.3
IDE                  : Visual Studio Code
Backend              : Django
Frontend             : HTML, CSS, JavaScript
Database             : MySQL
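Since the backend is Django with a MySQL database, the two are typically wired together as in the fragment below. This is a generic sketch (the project, database name, and credentials are placeholders of ours, not taken from this report); PyMySQL is registered as the MySQLdb driver, matching the `pymysql.install_as_MySQLdb()` call shown later in the source code section.

```python
# settings.py (fragment) -- database name and credentials are placeholders
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "poisoning_db",
        "USER": "root",
        "PASSWORD": "root",
        "HOST": "127.0.0.1",
        "PORT": "3306",
    }
}

# project __init__.py -- let Django's MySQL backend use PyMySQL
import pymysql
pymysql.install_as_MySQLdb()
```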

4.2 HARDWARE REQUIREMENTS:

System : Intel i5 Processor

Hard Disk : 512GB SSD or above.

RAM : 8GB.


5.SYSTEM DESIGN


5. SYSTEM DESIGN

5.1 INTRODUCTION

Architecture defines the components, modules, interfaces, and data for a system
to satisfy specified requirements. It can be seen as the application of systems
theory to product development.

5.2 ARCHITECTURE

FIGURE 5.1 PROPOSED ARCHITECTURE


5.3 UNIFIED MODELING LANGUAGE (UML)

UML stands for Unified Modelling Language. UML is a standardized general-purpose
modelling language in the field of object-oriented software engineering. The
standard is managed, and was created by, the Object Management Group.

GOALS OF UML:

The primary goals in the design of the UML are as follows:

1. Provide users a ready-to-use, expressive visual modelling language so that they
can develop and exchange meaningful models.

2. Provide extendibility and specialization mechanisms to extend the core concepts.

3. Be independent of particular programming languages and development processes.

4. Provide a formal basis for understanding the modelling language.


5.3.1 USE CASE DIAGRAM

A use case diagram in the Unified Modelling Language (UML) is a type of behavioural
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor. The roles of the
actors in the system can be depicted.

FIGURE 5.2 USE CASE DIAGRAM


5.3.2 CLASS DIAGRAM


In software engineering, a class diagram in the Unified Modelling Language
(UML) is a type of static structure diagram that describes the structure of a system by
showing the system's classes, their attributes, operations (or methods), and the
relationships among the classes. It explains which class contains information.

FIGURE 5.3 CLASS DIAGRAM


5.3.3 SEQUENCE DIAGRAM


A sequence diagram in Unified Modelling Language (UML) is a kind of
interaction diagram that shows how processes operate with one another and in what
order. It is a construct of a Message Sequence Chart. Sequence diagrams are
sometimes called event diagrams, event scenarios, and timing diagrams.

FIGURE 5.4 SEQUENCE DIAGRAM


5.3.4 COLLABORATION DIAGRAM:


In the Unified Modelling Language, a collaboration diagram depicts how
components are wired together to form larger components and/or software systems.
They are used to illustrate the structure of arbitrarily complex systems.
Components are wired together by using an assembly connector to connect the
required interface of one component with the provided interface of another
component. This illustrates the service consumer - service provider relationship
between the two components.

FIGURE 5.5 COLLABORATION DIAGRAM


5.3.5 ACTIVITY DIAGRAM:


Activity diagrams are graphical representations of workflows of stepwise activities and
actions with support for choice, iteration and concurrency. In the Unified Modeling Language,
activity diagrams can be used to describe the business and operational step-by-step workflows
of components in a system. An activity diagram shows the overall flow of control.

FIGURE 5.6 ACTIVITY DIAGRAM


6.IMPLEMENTATION


6. IMPLEMENTATION
6.1 MODULES
1. Bi-Level Optimization Module
2. ATTack on Federated Learning (AT2FL) Module
3. Optimal Attack Strategy Module

6.2 MODULES DESCRIPTION

1. Bi-Level Optimization Module: We propose a bilevel optimization framework
to compute optimal poisoning attacks on federated machine learning. To the
best of our knowledge, this is an early attempt to explore the vulnerability of
federated machine learning from the perspective of data poisoning.

2. AT2FL Module: We derive an effective optimization method, i.e., ATTack on
Federated Learning (AT2FL), to solve the optimal attack problem, which can
address the systems challenges associated with federated machine learning.

3. Optimal Attack Strategy Module: We demonstrate the empirical performance
of our optimal attack strategy and the proposed AT2FL algorithm on several
real-world datasets. The experimental results indicate that the communication
protocol among multiple nodes opens a door for attackers to attack federated
machine learning.
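The interplay of these modules can be illustrated with a minimal FedAvg-style simulation (a toy of our own with made-up data, not the AT2FL implementation): flipping one node's labels is enough to drag the shared model away from the slope every honest node is trying to learn, because the averaging protocol propagates the poison to all participants.

```python
import numpy as np

rng = np.random.default_rng(4)

def make_node(flip=False):
    # Every node draws 1-D data from the same model y = 2x + noise;
    # a poisoned node flips the sign of its regression targets.
    X = rng.uniform(-1, 1, (40, 1))
    y = 2 * X[:, 0] + rng.normal(0, 0.1, 40)
    return X, -y if flip else y

def run_fedavg(nodes, rounds=300, lr=0.1):
    # Minimal FedAvg-style loop: each node sends its local least-squares
    # gradient, the server averages them and updates the shared model.
    w = np.zeros(1)
    for _ in range(rounds):
        grads = [X.T @ (X @ w - y) / len(y) for X, y in nodes]
        w = w - lr * np.mean(grads, axis=0)
    return float(w[0])

clean_nodes = [make_node() for _ in range(5)]
poisoned_nodes = [make_node(flip=(i == 0)) for i in range(5)]

slope_clean = run_fedavg(clean_nodes)
slope_poisoned = run_fedavg(poisoned_nodes)
print(f"learned slope, all nodes honest:      {slope_clean:.2f}")
print(f"learned slope, 1 of 5 nodes poisoned: {slope_poisoned:.2f}")
```

Even this naive, untargeted poisoning shifts the shared model noticeably; the bilevel attack described above goes further by choosing the poison to maximize damage.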


6.3 SOURCE CODE:

pip install numpy==1.18.1
pip install matplotlib==3.1.3        # used to display graphs
pip install pandas==0.25.3           # used to read datasets
pip install opencv-python==4.2.0.32  # used for image reading
pip install keras==2.3.1             # used for neural network implementation
pip install tensorflow==1.14.0       # used for CNN implementation
pip install h5py==2.10.0             # support for the tensorflow and keras libraries
pip install pillow==7.0.0
pip install sklearn-genetic==0.2
pip install SwarmPackagePy
pip install scikit-learn==0.22.2.post1   # machine learning algorithms (decision tree, random forest, etc.)
pip install sklearn-extensions==0.0.2
pip install pyswarms==1.1.0
pip install protobuf==3.20.0 --user
pip install nltk
pip install django==2.1.7
pip install pymysql==0.9.3
pip install --only-binary :all: mysqlclient --user
pip install mysqlclient --user

sc delete mysql

import pymysql
pymysql.install_as_MySQLdb()

predict = model.predict_classes(test)

pip install --user -U nltk


python
>>> import nltk

>>> nltk.download()
pip install -r requirements.txt
global filename
text.delete('1.0', END)
filename = filedialog.askopenfilename(initialdir="dataset")
dataset = pd.read_csv(filename)
,on_delete=models.CASCADE,
python -m pip install --user -r requirements.txt

# requirements.txt (Python 3.6.2):
django==1.11.6
mysqlclient==1.3.12
<?xml version="1.0" encoding="UTF-8"?>
<module type="PYTHON_MODULE" version="4">
<component name="NewModuleRootManager">
<content url="file://$MODULE_DIR$">
<excludeFolder url="file://$MODULE_DIR$/venv" />
</content>
<orderEntry type="inheritedJdk" />
<orderEntry type="sourceFolder" forTests="false" />
</component>
<component name="TestRunnerService">
<option name="PROJECT_TEST_RUNNER" value="Unittests" />
</component>
</module>
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ProjectRootManager" version="2" project-jdk-name="Python
3.6 (venv) (129)" project-jdk-type="Python SDK" />
</project>

<?xml version="1.0" encoding="UTF-8"?>


<project version="4">
<component name="ProjectModuleManager">
<modules>
<module fileurl="file://$PROJECT_DIR$/.idea/Malware_Detection.iml"
filepath="$PROJECT_DIR$/.idea/Malware_Detection.iml" />
</modules>
</component>
</project>


7.SCREENSHOTS


7. SCREENSHOTS

7.1 Starting the Server:


7.2 Login & Registration Page:


7.3 Accuracy Scores:


7.4 Visualization of Output:


8.TESTING


8. TESTING
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way
to check the functionality of components, sub-assemblies, assemblies, and/or a
finished product. It is the process of exercising software with the intent of ensuring
that the software system meets its requirements and user expectations and does
not fail unacceptably. There are various types of tests, and each test type addresses
a specific testing requirement.

8.1 TYPES OF TESTS

8.1.1 Unit testing


Unit testing involves the design of test cases that validate that the
internal program logic is functioning properly and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is the
testing of individual software units of the application; it is done after the completion
of an individual unit and before integration. This is structural testing that relies on
knowledge of the unit's construction and is invasive. Unit tests perform basic tests at
the component level and test a specific business process, application, and/or system
configuration. Unit tests ensure that each unique path of a business process performs
accurately to the documented specifications and contains clearly defined inputs and
expected results.
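In this project's setting, a unit test exercises one helper in isolation. The example below is hypothetical (the `flip_labels` helper and its behaviour are ours, not taken from the project's code base), but it shows the shape such a test takes with Python's built-in `unittest`:

```python
import random
import unittest

def flip_labels(labels, fraction, seed=0):
    """Hypothetical experiment helper: flip the sign of `fraction` of the
    +/-1 labels, deterministically under a fixed seed."""
    rng = random.Random(seed)
    labels = list(labels)
    k = int(fraction * len(labels))
    for i in rng.sample(range(len(labels)), k):
        labels[i] = -labels[i]
    return labels

class TestFlipLabels(unittest.TestCase):
    def test_requested_fraction_is_flipped(self):
        clean = [1] * 100
        poisoned = flip_labels(clean, 0.2)
        flipped = sum(1 for c, p in zip(clean, poisoned) if c != p)
        self.assertEqual(flipped, 20)

    def test_zero_fraction_is_identity(self):
        clean = [1, -1, 1, -1]
        self.assertEqual(flip_labels(clean, 0.0), clean)

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=2)
```

Each test case validates one unique path of the helper against its documented behaviour, matching the description above.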

8.1.2 Integration testing


Integration tests are designed to test integrated software components to
determine if they actually run as one program. Testing is event-driven and is more
concerned with the basic outcome of screens or fields. Integration tests demonstrate that
although the components were individually satisfied, as shown by successful unit testing,
the combination of components is correct and consistent.


8.1.3 Functional test


Functional tests provide systematic demonstrations that functions tested are
available as specified by the business and technical requirements, system
documentation, and user manuals.
Functional testing is centered on the following items:

Valid Input : identified classes of valid input must be accepted.


Invalid Input : identified classes of invalid input must be rejected.
Functions : identified functions must be exercised.
Output : identified classes of application outputs must be exercised.
Systems/Procedures : interfacing systems or procedures must be invoked.

Organization and preparation of functional tests is focused on requirements, key
functions, or special test cases. In addition, systematic coverage pertaining to
identifying business process flows, data fields, predefined processes, and successive
processes must be considered for testing. Before functional testing is complete,
additional tests are identified and the effective value of current tests is determined.
8.1.4 System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration-oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.

8.1.5 White Box Testing


White box testing is testing in which the software tester has knowledge of the
inner workings, structure, and language of the software, or at least its purpose.
It is used to test areas that cannot be reached from a black box level.


8.1.6 Black Box Testing


Black box testing is testing the software without any knowledge of the
inner workings, structure, or language of the module being tested. Black box tests,
like most other kinds of tests, must be written from a definitive source document,
such as a specification or requirements document.
It is testing in which the software under test is treated as a black box: you cannot
"see" into it. The test provides inputs and responds to outputs without considering
how the software works.

8.1.7 Unit Testing


Unit testing is usually conducted as part of a combined code and unit
test phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.

8.2 Test strategy and approach

Test objectives

Field testing will be performed manually and functional tests will be written in detail.
• All field entries must work properly.

• Pages must be activated from the identified link.

• The entry screen, messages and responses must not be delayed.

Features to be tested

• Verify that the entries are of the correct format

• No duplicate entries should be allowed


8.2.1 Integration Testing


Software integration testing is the incremental integration testing of two
or more integrated software components on a single platform, intended to expose
failures caused by interface defects.
The task of the integration test is to check that components or software applications,
e.g., components in a software system or, one step up, software applications at the
company level, interact without error.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.
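In this project, an integration test would verify that the client-side training component and the server-side aggregation component interact correctly across their interface. The `local_update` and `server_aggregate` functions below are simplified stand-ins for those components, assumed for illustration only:

```python
import numpy as np

def local_update(weights, data, lr=0.1):
    """Client component (illustrative): nudge weights toward the data mean."""
    return weights + lr * (np.mean(data, axis=0) - weights)

def server_aggregate(updates):
    """Server component: average the client updates (FedAvg-style)."""
    return np.mean(np.stack(updates), axis=0)

# Integration check: outputs of the client component must be valid
# inputs to the server component, and the round must preserve shape.
global_weights = np.zeros(2)
client_data = [np.array([[1.0, 1.0]]), np.array([[3.0, 3.0]])]
updates = [local_update(global_weights, d) for d in client_data]
new_global = server_aggregate(updates)
assert new_global.shape == global_weights.shape
```

The assertion checks the interface contract between the two components rather than the internal correctness of either one, which is the defining concern of integration testing.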

8.2.2 Acceptance Testing


User Acceptance Testing is a critical phase of any project and requires significant
participation by the end user. It also ensures that the system meets the functional
requirements.
Test Results: All the test cases mentioned above passed successfully. No defects
encountered.


9. CONCLUSION & FUTURE SCOPE

9.1 PROJECT CONCLUSION


In conclusion, this project has addressed the pressing issue of data poisoning attacks in
Federated Machine Learning (FML) systems. Through thorough investigation and analysis, we
have identified vulnerabilities inherent in decentralized learning environments and developed
specialized defense mechanisms to mitigate the risks associated with such attacks. By
recognizing the unique challenges posed by FML systems, we have proposed effective
strategies to safeguard model integrity and reliability. Our findings emphasize the significance
of combatting data poisoning attacks to ensure the security and trustworthiness of collaborative
machine learning endeavors across diverse domains.

9.2 FURTHER ENHANCEMENT

Moving forward, several avenues for future research and development emerge in the domain
of data poisoning attacks on federated machine learning. These include:

• Enhanced Detection Techniques: Further refinement and exploration of detection
mechanisms to identify subtle instances of data poisoning attacks, utilizing advanced
anomaly detection and adversarial example analysis techniques.

• Robustness Across Models: Extending defense strategies to encompass a broader
range of machine learning models common in federated settings, ensuring
comprehensive protection against diverse attack vectors.

• Scalability and Efficiency: Optimization of defense mechanisms for scalability and
efficiency, enabling practical deployment in real-world federated learning scenarios
with large-scale and dynamic datasets.

• Privacy-Preserving Solutions: Exploration of privacy-preserving methodologies to
mitigate the risk of data leakage and privacy breaches in federated learning systems
while defending against data poisoning attacks.
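As one concrete illustration of the anomaly-detection direction above, a server could flag client updates that lie unusually far from the coordinate-wise median update before aggregating them. The threshold rule and helper below are an illustrative sketch under that assumption, not the project's implementation:

```python
import numpy as np

def flag_suspicious_updates(updates, threshold=2.0):
    """Return indices of client updates whose distance from the
    coordinate-wise median update exceeds `threshold` times the
    median of all such distances (a simple anomaly criterion)."""
    stacked = np.stack(updates)
    median_update = np.median(stacked, axis=0)
    distances = np.linalg.norm(stacked - median_update, axis=1)
    cutoff = threshold * np.median(distances)
    return [i for i, d in enumerate(distances) if d > cutoff]

# Nine benign clients near zero, one poisoned client with a large update
rng = np.random.default_rng(0)
benign = [rng.normal(0, 0.1, size=10) for _ in range(9)]
poisoned = np.full(10, 5.0)
suspects = flag_suspicious_updates(benign + [poisoned])
assert 9 in suspects  # the poisoned update stands out from the median
```

A median-based criterion like this is robust to a minority of poisoned clients, since a few extreme updates barely move the median; production defenses would combine it with the other directions listed above.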


10. BIBLIOGRAPHY
10.1 REFERENCES

1. Biggio, B., & Roli, F. (2018). Wild patterns: Ten years after the rise of adversarial
machine learning. Pattern Recognition, 84, 317-331.

2. Steinhardt, J., Koh, P. W., Liang, P., & Feizi, S. (2017). Certified defenses against
adversarial examples. arXiv preprint arXiv:1705.07263.

3. Bhagoji, A. N., He, W., Li, B., & Song, D. (2018). Exploring the space of black-box
attacks on deep neural networks. arXiv preprint arXiv:1712.09491.

4. Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., & Shmatikov, V. (2018). How to
backdoor federated learning. arXiv preprint arXiv:1807.00459.

5. Nasr, M., Shokri, R., Houmansadr, A., & Gehrke, J. (2019). Comprehensive privacy
analysis of deep learning: Stand-alone and federated learning under passive and active
white-box inference attacks. arXiv preprint arXiv:1909.02605.

6. Liu, Y., Ma, S., Arai, M., & Masuda, H. (2019). Data poisoning attacks on federated
learning based recommender systems. arXiv preprint arXiv:1908.08311.

7. Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... &
Vinayakumar, R. (2019). Advances and open problems in federated learning. arXiv
preprint arXiv:1912.0497
