Major Project
On
Data Poisoning Attacks on Federated Machine Learning
Submitted in partial fulfilment of the requirements for the award of the degree of
BACHELOR OF TECHNOLOGY
In
INFORMATION TECHNOLOGY
BY
S.Devesh Kumar (Regd.No:207R1A1251)
B.Vishal Adithya (Regd.No:217R5A1204)
G.Surakshitha (Regd.No:207R1A1217)
CERTIFICATE
This is to certify that the project entitled “Data Poisoning Attacks on Federated
Machine Learning" being submitted by S. Devesh Kumar(207R1A1251), B. Vishal Adithya
(217R5A1204), G. Surakshitha (207R1A1217), T. Vamshi Krishna (207R1A1255) in partial
fulfilment of the requirements for the award of the degree of B. Tech in Information Technology
of the Jawaharlal Nehru Technological University Hyderabad, is a record of the bonafide work
carried out by them under our guidance and supervision during the year 2023-2024.
The results embodied in this thesis have not been submitted to any other university
or institute for the award of any degree or diploma.
Dr. B. Kavitha Rani
HEAD OF THE DEPARTMENT
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
Apart from our own efforts, the success of any project depends largely on the
encouragement and guidance of many others. We take this opportunity to express our
gratitude to the people who have been instrumental in the successful completion of this
project.
We take this opportunity to express our profound gratitude and deep regard to our
guide, Mr. K. Srinu, Assistant Professor, for his exemplary guidance, monitoring, and
constant encouragement throughout the project work.
We also take this opportunity to express a deep sense of gratitude to the Project
Review Committee (PRC) Coordinator, Mr. MD. Sajid Pasha, Associate Professor,
for his cordial support and valuable information, which helped us in completing this task
through its various stages.
We are also thankful to the Head of the Department Dr. B. Kavitha Rani for
providing excellent infrastructure and a nice atmosphere for completing this project
successfully.
We would like to express our sincere gratitude to Dr. M. Ahmed Ali Baig, Dean
Administration, Dr. DTV. Dharmajee Rao, Dean Academics, and Dr. Ashutosh Saxena,
Dean of R&D, for their encouragement throughout the course of this project.
We are obliged to our Director Dr. A. Raji Reddy for being cooperative
throughout the course of this project.
We would like to express our sincere gratitude to the Management of CMR
Technical Campus, Hyderabad: Sri. Ch. Gopal Reddy, Honourable Chairman; Smt.
C. Vasantha Latha, Honourable Secretary; and Sri. C. Abhinav Reddy, Honourable Chief
Executive Officer.
The guidance and support received from all the members of CMR TECHNICAL
CAMPUS who contributed to this project were vital to its success. We are grateful for
their constant support and help.
Finally, we would like to take this opportunity to thank our families for their
constant encouragement, without which this work would not have been possible.
ABSTRACT
Federated machine learning, which enables resource-constrained node devices (e.g.,
mobile phones and IoT devices) to learn a shared model while keeping the training data
local, can provide privacy, security, and economic benefits through the design of an
effective communication protocol. However, the communication protocol amongst
different nodes can be exploited by attackers to launch data poisoning attacks, which
have been demonstrated to be a serious threat to most machine learning models.
In this work, we explore the vulnerability of federated machine learning.
More specifically, we focus on attacking a federated multi-task learning framework,
a federated learning framework that adopts a general multi-task learning formulation
to handle statistical challenges. We formulate the problem of computing optimal
poisoning attacks on federated multi-task learning as a bilevel program that is
adaptive to an arbitrary choice of target nodes and source attacking nodes.
We then propose a novel systems-aware optimization method, ATTack on Federated
Learning (AT2FL), which efficiently derives the implicit gradients for poisoned
data and further computes optimal attack strategies for federated machine learning.
Our work is an early study of data poisoning attacks on federated learning.
Experimental results on real-world datasets show that the federated multi-task
learning model is highly sensitive to poisoning attacks when attackers either
directly poison the target nodes or indirectly poison related nodes by exploiting
the communication protocol.
Data Poisoning Attacks on Federated Machine Learning
LIST OF FIGURES
LIST OF SCREENSHOTS
INDEX
ABSTRACT i
LIST OF FIGURES ii
LIST OF SCREENSHOTS iii
1. INTRODUCTION 3
1.1 INTRODUCTION 4
1.2 OBJECTIVES 5
1.3 LIMITATIONS 5
2. SYSTEM ANALYSIS 6
2.1 INTRODUCTION 7
3. SYSTEM STUDY 9
3.1 FEASIBILITY STUDY 10
3.1.1 ECONOMICAL FEASIBILITY 10
4. SYSTEM REQUIREMENTS 12
4.1 REQUIREMENTS 13
4.2 HARDWARE REQUIREMENTS 13
5. SYSTEM DESIGN 14
5.1 INTRODUCTION 15
5.2 ARCHITECTURE 15
6. IMPLEMENTATION 21
6.1 MODULES 22
7. SCREENSHOTS 25
8. TESTING 29
8.1 INTRODUCTION 30
9. CONCLUSION & FUTURE SCOPE
10. BIBLIOGRAPHY 36
10.1 REFERENCES 37
1. INTRODUCTION
1.1 INTRODUCTION
Machine learning has been widely applied in a broad array of applications,
e.g., spam filtering and natural gas price prediction. Across these applications, the reliability
and security of machine learning systems have been a great concern, particularly in the
presence of adversaries. For example, to build a product recommendation system, developers
can rely either on public crowdsourcing platforms, e.g., Amazon Mechanical Turk or Taobao,
or on private teams to collect training datasets. However, both of these methods are
susceptible to having corrupted or poisoned data injected by attackers. To improve the
robustness of real-world machine learning systems, it is critical to study how well machine
learning performs under poisoning attacks. Attack strategies on machine learning methods
can be divided into two categories: causative attacks and exploratory attacks. Causative
attacks influence learning by exerting control over the training data, while exploratory
attacks exploit misclassifications without affecting training.
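To make the causative category concrete, the following toy sketch (hypothetical data and a deliberately simple nearest-centroid classifier, not code from this project) shows how injecting a handful of mislabeled points into the training set drags the learned model away from the clean data:

```python
def fit_centroids(xs, ys):
    """Fit a 1-D nearest-centroid binary classifier; return the class means."""
    mean = lambda vals: sum(vals) / len(vals)
    c0 = mean([x for x, y in zip(xs, ys) if y == 0])
    c1 = mean([x for x, y in zip(xs, ys) if y == 1])
    return c0, c1

def predict(c0, c1, x):
    """Assign x to whichever class centroid is closer."""
    return 0 if abs(x - c0) <= abs(x - c1) else 1

def accuracy(c0, c1, xs, ys):
    return sum(predict(c0, c1, x) == y for x, y in zip(xs, ys)) / len(ys)

# Clean training data: class 0 clustered near 0.5, class 1 near 2.5.
xs = [i / 10 for i in range(10)] + [2 + i / 10 for i in range(10)]
ys = [0] * 10 + [1] * 10
c0, c1 = fit_centroids(xs, ys)
print(accuracy(c0, c1, xs, ys))   # 1.0 -- the clean model separates perfectly

# Causative attack: inject a few far-away points labelled class 0, dragging
# the learned class-0 centroid far away from its true cluster.
poisoned_xs = xs + [10.0] * 5
poisoned_ys = ys + [0] * 5
p0, p1 = fit_centroids(poisoned_xs, poisoned_ys)
print(accuracy(p0, p1, xs, ys))   # 0.5 -- every true class-0 point is misclassified
```

Here the five injected points pull the class-0 centroid past the class-1 cluster, so every genuine class-0 example is misclassified; real attacks optimize the poisoned points rather than placing them by hand.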
1.2 OBJECTIVES
The objectives of the project are:
• Classify data poisoning attacks in federated machine learning, distinguishing
between conventional and FML-specific strategies.
• Develop detection mechanisms to identify malicious data injected during
decentralized training in FML systems, utilizing anomaly detection and
adversarial example analysis.
• Implement mitigation strategies to counteract data poisoning, including resilient
aggregation algorithms and cryptographic protocols, ensuring model integrity.
• Evaluate the impact of data poisoning on FML model performance, considering
accuracy, convergence rate, and susceptibility to adversarial manipulation, while
ensuring scalability and efficiency of proposed defenses.
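As one concrete instance of the resilient-aggregation objective above, a coordinate-wise median (shown here as a minimal sketch with hypothetical update vectors; production systems use schemes such as trimmed mean or Krum) bounds the influence any single malicious client has on the aggregated model update:

```python
def mean_aggregate(updates):
    """Plain federated-averaging style mean of client updates."""
    return [sum(coord) / len(coord) for coord in zip(*updates)]

def median_aggregate(updates):
    """Coordinate-wise median: robust to a minority of extreme updates."""
    def median(vals):
        s = sorted(vals)
        n = len(s)
        return s[n // 2] if n % 2 else (s[n // 2 - 1] + s[n // 2]) / 2
    return [median(coord) for coord in zip(*updates)]

# Three honest clients report updates near [1.0, 1.0]; one attacker
# reports an extreme update to drag the global model off course.
honest = [[0.5, 1.5], [1.0, 1.0], [1.5, 0.5]]
updates = honest + [[100.0, -100.0]]

print(mean_aggregate(updates))    # [25.75, -24.25] -- ruined by one client
print(median_aggregate(updates))  # [1.25, 0.75] -- stays near the honest values
```

The mean moves arbitrarily far with a single extreme update, while the median moves only as far as the middle clients allow, which is exactly the property a resilient aggregation rule needs.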
1.3 LIMITATIONS
The limitations of the project are:
• Generalization: The findings and recommendations generated from the project may be
specific to certain types of data poisoning attacks or FML configurations, limiting their
generalizability across diverse domains and scenarios.
2. SYSTEM ANALYSIS
2.1 INTRODUCTION
• No Specific Model: The existing system does not include a data poisoning attack model
for federated machine learning.
• No Data Integrity Check: There is no data integrity check technique for detecting data
poisoning attacks.
• Defence Mechanisms:
o Integrates defence strategies specifically designed to address the decentralized
nature and unique challenges of federated machine learning environments.
• Enhanced Adaptability:
o Adaptable to the dynamic nature of federated learning settings, ensuring robust
protection against evolving data poisoning attacks.
• Comprehensive Coverage:
o Provides comprehensive coverage across various machine learning models
commonly employed in federated settings, ensuring a holistic approach to
defending against attacks.
• Scalability and Practical Applicability:
o Designed with scalability in mind, enabling practical deployment in real-world
scenarios with large-scale and evolving datasets while maintaining effectiveness
against sophisticated attacks.
3. SYSTEM STUDY
3.1 FEASIBILITY STUDY
3.1.1 ECONOMICAL FEASIBILITY
This study is carried out to check the economic impact that the system will
have on the organization. The amount of funds that the company can pour into the
research and development of the system is limited, so the expenditures must be justified.
The developed system is well within the budget, which was achieved because most of the
technologies used are freely available; only the customized products had to be purchased.
4. SYSTEM REQUIREMENTS
4.2 HARDWARE REQUIREMENTS
• RAM : 8 GB
5. SYSTEM DESIGN
5.1 INTRODUCTION
Architecture defines the components, modules, interfaces, and data for a system
to satisfy specified requirements. It can be seen as the application of systems
theory to product development.
5.2 ARCHITECTURE
GOALS OF UML:
A use case diagram in the Unified Modelling Language (UML) is a type of behavioural
diagram defined by and created from a Use-case analysis. Its purpose is to present a graphical
overview of the functionality provided by a system in terms of actors, their goals (represented
as use cases), and any dependencies between those use cases. The main purpose of a use case
diagram is to show what system functions are performed for which actor. The roles of the
actors in the system can be depicted.
6. IMPLEMENTATION
6.1 MODULES
The project is divided into the following modules:
1. Bi-Level Optimization Module.
2. Attack to Federated Learning (AT2FL) Module.
3. Optimal Attack Strategy Module.
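To illustrate how these modules relate, the sketch below uses a deliberately tiny stand-in problem: a 1-D least-squares learner with a closed-form inner solution and finite-difference gradients. This is a hypothetical toy, not the actual AT2FL implementation (which derives implicit gradients for the full federated multi-task objective), but it shows the bi-level structure: the inner level trains the model on data containing a poisoned point, and the outer level adjusts that point to maximize the trained model's error on a target.

```python
def train(xs, ys):
    """Inner level: the learner's closed-form least-squares fit, w = sum(x*y)/sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def attacker_objective(y_poison):
    """Outer level: the loss the trained model suffers on the attacker's target point."""
    xs = [1.0, 2.0, 3.0, 1.0]          # the last training point is attacker-controlled
    ys = [1.0, 2.0, 3.0, y_poison]
    w = train(xs, ys)
    x_t, y_t = 2.0, 2.0                # target point the attacker wants mispredicted
    return (w * x_t - y_t) ** 2

def optimal_attack(bound=5.0, lr=2.0, steps=60):
    """Gradient ascent on the poisoned label, projected back onto [-bound, bound]."""
    y_p, eps = 0.0, 1e-4
    for _ in range(steps):
        # Gradient of the outer objective approximated by central finite differences
        # (standing in for the implicit gradients that AT2FL derives analytically).
        g = (attacker_objective(y_p + eps) - attacker_objective(y_p - eps)) / (2 * eps)
        y_p = max(-bound, min(bound, y_p + lr * g))
    return y_p

y_star = optimal_attack()
print(y_star)   # -5.0: the attack pushes the label to the boundary of the feasible set
```

Because the attacker's objective grows the further the poisoned label moves from the honest data, the projected gradient ascent drives it to the boundary of its feasible set, which is the optimal attack strategy in this toy setting.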
Commands used while setting up the project:

# Windows: remove a stale MySQL service registration
sc delete mysql

# Install the project's Python dependencies
pip install -r requirements.txt
python -m pip install --user -r requirements.txt

Python 3.6.2

requirements.txt:
django==1.11.6
mysqlclient==1.3.12

Key Python snippets:

# Let Django's MySQLdb backend work through PyMySQL
import pymysql
pymysql.install_as_MySQLdb()

# Keras: predict class labels for the test set
predict = model.predict_classes(test)

# Open the NLTK data downloader
>>> nltk.download()

# Tkinter: let the user pick a dataset file and load it with pandas
global filename
text.delete('1.0', END)
filename = filedialog.askopenfilename(initialdir="dataset")
dataset = pd.read_csv(filename)

# Django model field option for foreign keys
,on_delete=models.CASCADE,

PyCharm project configuration (.idea/Malware_Detection.iml):

<?xml version="1.0" encoding="UTF-8"?>
<module type="PYTHON_MODULE" version="4">
  <component name="NewModuleRootManager">
    <content url="file://$MODULE_DIR$">
      <excludeFolder url="file://$MODULE_DIR$/venv" />
    </content>
    <orderEntry type="inheritedJdk" />
    <orderEntry type="sourceFolder" forTests="false" />
  </component>
  <component name="TestRunnerService">
    <option name="PROJECT_TEST_RUNNER" value="Unittests" />
  </component>
</module>

.idea/misc.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="ProjectRootManager" version="2" project-jdk-name="Python 3.6 (venv) (129)" project-jdk-type="Python SDK" />
</project>

.idea/modules.xml:

<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
  <component name="ProjectModuleManager">
    <modules>
      <module fileurl="file://$PROJECT_DIR$/.idea/Malware_Detection.iml" filepath="$PROJECT_DIR$/.idea/Malware_Detection.iml" />
    </modules>
  </component>
</project>
7. SCREENSHOTS
8. TESTING
The purpose of testing is to discover errors. Testing is the process of trying to
discover every conceivable fault or weakness in a work product. It provides a way
to check the functionality of components, sub-assemblies, assemblies, and/or a
finished product. It is the process of exercising software with the intent of
ensuring that the software system meets its requirements and user expectations
and does not fail in an unacceptable manner. There are various types of tests,
and each test type addresses a specific testing requirement.
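As a concrete example of a test that targets one specific requirement, the snippet below exercises a small accuracy metric with Python's unittest (the accuracy helper is a hypothetical stand-in, not code from this project):

```python
import unittest

def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)

class TestAccuracy(unittest.TestCase):
    def test_perfect_predictions_score_one(self):
        self.assertEqual(accuracy([0, 1, 1], [0, 1, 1]), 1.0)

    def test_partially_correct_predictions(self):
        self.assertEqual(accuracy([0, 0, 1, 1], [0, 1, 1, 1]), 0.75)

    def test_mismatched_lengths_raise(self):
        with self.assertRaises(ValueError):
            accuracy([0], [0, 1])

# Run the suite programmatically rather than via unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAccuracy)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test method checks exactly one requirement (correct value, partial credit, invalid input), which is the discipline the paragraph above describes.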
Test objectives
Features to be tested
9. CONCLUSION & FUTURE SCOPE
Moving forward, several avenues for future research and development emerge in the domain
of data poisoning attacks on federated machine learning.
10. BIBLIOGRAPHY
10.1 REFERENCES
1. Biggio, B., & Roli, F. (2018). Wild patterns: Ten years after the rise of adversarial
machine learning. Pattern Recognition, 84, 317-331.
2. Steinhardt, J., Koh, P. W., & Liang, P. (2017). Certified defenses for data
poisoning attacks. arXiv preprint arXiv:1706.03691.
3. Bhagoji, A. N., He, W., Li, B., & Song, D. (2018). Exploring the space of black-box
attacks on deep neural networks. arXiv preprint arXiv:1712.09491.
4. Bagdasaryan, E., Veit, A., Hua, Y., Estrin, D., & Shmatikov, V. (2018). How to
backdoor federated learning. arXiv preprint arXiv:1807.00459.
5. Nasr, M., Shokri, R., & Houmansadr, A. (2019). Comprehensive privacy analysis of
deep learning: Passive and active white-box inference attacks against centralized and
federated learning. arXiv preprint arXiv:1812.00910.
6. Liu, Y., Ma, S., Arai, M., & Masuda, H. (2019). Data poisoning attacks on federated
learning based recommender systems. arXiv preprint arXiv:1908.08311.
7. Kairouz, P., McMahan, H. B., Avent, B., Bellet, A., Bennis, M., Bhagoji, A. N., ... &
Vinayakumar, R. (2019). Advances and open problems in federated learning. arXiv
preprint arXiv:1912.04977.