Smart Surveillance Report

ACKNOWLEDGEMENT
It is a great pleasure for us to acknowledge the assistance and support of a large number
of individuals who have been responsible for the successful completion of this project work.
First, we take this opportunity to express our sincere gratitude to Faculty of
Engineering & Technology, Jain (Deemed-to-be University), for providing us with a great
opportunity to pursue our Bachelor’s Degree in this institution.
We place on record our sincere thanks to Dr. Geetha G, Director, School of CSE,
Faculty of Engineering & Technology, Jain (Deemed-to-be University), for the
continuous encouragement.
It is a matter of immense pleasure to express our sincere thanks to Dr. Rajesh A,
Professor and HOD, Department of Computer Science & Engineering, Jain (Deemed-
to-be University), for providing the right academic guidance that made our task possible.
It is a matter of immense pleasure to express our sincere thanks to Prof.
Narasimhayya B E, Program Coordinator, Department of Computer Science &
Engineering, Jain (Deemed-to-be University), for providing the right academic guidance
that made our task possible.
We would like to thank our guide Ms. Sunena Rose M V, Assistant Professor, Dept.
of Computer Science & Engineering, Jain (Deemed-to-be University), for sparing her
valuable time to extend help at every step of our project work, which paved the way for
the smooth progress and fruitful culmination of the project.
We are also grateful to our family and friends who provided us with every requirement
throughout the course.
We would like to thank one and all who directly or indirectly helped us in completing
the Project work successfully.
Signature of Students
TABLE OF CONTENTS

CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
NOMENCLATURE USED
Chapter 1
1. INTRODUCTION
1.1. Overview
1.2. Problem Statement
1.3. Objectives
1.4. Methodology
Chapter 2
2. LITERATURE SURVEY
2.1. Related Work
Chapter 3
3. METHODOLOGY
3.1. Architecture
3.2. Sequence Diagram
Chapter 4
4. TOOL DESCRIPTION
Chapter 5
5. IMPLEMENTATION
Chapter 6
6. RESULTS AND ANALYSIS
Chapter 7
7. CONCLUSIONS AND FUTURE SCOPE
7.1 Conclusion
REFERENCES
APPENDIX
ABSTRACT
The increasing availability and advancement of digital surveillance systems have underscored
the critical need for efficient and accurate weapon detection solutions to enhance public
safety and security. This report presents an approach for real-time weapon
detection using the YOLOv8 (You Only Look Once, version 8) deep learning framework. The
proposed system leverages convolutional neural networks (CNNs) to automatically detect and
classify weapons in real-time video streams. By employing a single-stage object detection
architecture, YOLOv8 achieves high accuracy and efficiency in real-time applications. The
weapon detection pipeline involves several stages: data
collection, preprocessing, model training, and real-time inference. A comprehensive dataset,
comprising a diverse range of firearm images, is carefully curated to train the YOLOv8
model. Transfer learning techniques are employed to fine-tune the pretrained model
specifically for detecting different types of weapons. During the inference stage, the
YOLOv8 model analyzes video frames in real-time, utilizing parallel computation to achieve
high processing throughput. By examining the spatial relationships and contextual
information within the frames, the model accurately localizes and classifies weapons,
providing real-time alerts for potential threats.
Keywords: Weapon detection, YOLOv8, real-time, deep learning, object detection, public
safety, surveillance systems.
LIST OF FIGURES

Fig. No. Description of Figure
Fig 5.1 Dataset Labelling using MakeSenseAI
Fig 5.2 YAML Configuration File
Fig 5.3 Command for training YOLO for our custom data
Fig 6.1 Specified the window pixels and initiation of the flask module
Fig 6.2 Specified the window pixels and logs of the flask module
Fig 6.3 Information on the layers, parameters, gradients, and GFLOPs reported by the YOLOv8 model
Fig 6.4 Output of the model shown on the User Interface in Realtime
Fig 6.6 Model detecting various types of pistols
NOMENCLATURE USED
Chapter 1
INTRODUCTION
1.1. Overview
This project focuses on the development of a real-time weapon detection system using the
YOLOv8 deep learning framework. The objective is to leverage the power of convolutional
neural networks (CNNs) to automatically detect and classify weapons in live video streams,
enhancing public safety and security in various settings.
The project encompasses multiple stages, starting with data collection. A diverse dataset
containing firearm images is curated to train the YOLOv8 model specifically for weapon
detection. Transfer learning techniques are applied to fine-tune the pretrained model, enabling
it to identify various types of weapons accurately.
Overall, the project aims to contribute to public safety by leveraging deep learning techniques
to automate weapon detection, reducing manual effort and response time in critical situations.
1.2. Problem Statement
Traditional methods of weapon detection often rely on manual inspection, which is time-consuming,
labor-intensive, and prone to human error. To overcome these limitations, an
automated solution using deep learning techniques is proposed. The goal is to develop a real-time
weapon detection system based on the YOLOv8 framework, capable of accurately
identifying and classifying weapons in live video feeds.
1.3. Objectives
Develop a real-time weapon detection system: The primary objective of this project is to design
and implement a robust real-time weapon detection system using the YOLOv8 deep learning
framework. The system should be capable of analyzing live video streams and promptly
detecting weapons with high accuracy.
i) Curate a diverse weapon dataset: Collect and curate a comprehensive dataset containing a
wide range of firearm images and other potentially threatening objects. This dataset will be
used for training the YOLOv8 model to accurately identify and classify different types of
weapons.
ii) Train and fine-tune the YOLOv8 model: Utilize the curated dataset to train the YOLOv8
model, leveraging transfer learning techniques to adapt the pretrained model to the specific
task of weapon detection. Fine-tune the model to achieve high accuracy, precision, and recall
rates for weapon identification.
iii) Evaluate the system: Assess its usability, reliability, and scalability in detecting weapons
and providing real-time alerts for potential threats.
By achieving these objectives, the project aims to contribute to enhancing public safety and
security by providing an efficient and accurate real-time weapon detection system that can be
deployed in various settings requiring threat detection and mitigation.
1.4. Methodology
1. Data Collection and Preparation:
• Gather a diverse dataset of firearm images and other potentially threatening objects.
Include variations in lighting conditions, angles, and occlusions to enhance the
model's robustness.
• Annotate the dataset with bounding boxes indicating the location of weapons in each
image.
• Split the dataset into training, validation, and testing sets.
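A minimal sketch of such a split is shown below; the 80/10/10 ratio, folder layout, and file extensions are assumptions for illustration, with images and YOLO-format label files sharing base names:

    # Illustrative 80/10/10 train/val/test split; paths are assumptions.
    import random
    import shutil
    from pathlib import Path

    random.seed(42)
    images = sorted(Path("dataset/images").glob("*.jpg"))
    random.shuffle(images)

    n = len(images)
    splits = {"train": images[:int(0.8 * n)],
              "val":   images[int(0.8 * n):int(0.9 * n)],
              "test":  images[int(0.9 * n):]}

    for name, files in splits.items():
        for img in files:
            label = Path("dataset/labels") / (img.stem + ".txt")  # YOLO label file
            for src, sub in ((img, "images"), (label, "labels")):
                dst = Path("dataset") / name / sub
                dst.mkdir(parents=True, exist_ok=True)
                shutil.copy(src, dst / src.name)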
2. Model Selection and Adaptation:
• Choose YOLOv8 as the base object detection framework due to its real-time processing
capabilities and high accuracy.
• Adapt the YOLOv8 architecture for weapon detection by modifying the output classes
to represent different weapon categories.
3. Model Training:
• Initialize the YOLOv8 model with pretrained weights from a large-scale dataset, such
as COCO or ImageNet.
• Perform transfer learning by freezing the initial layers and fine-tuning the remaining
layers on the weapon detection dataset (see the sketch below).
• Utilize optimization techniques such as gradient descent and adaptive learning-rate
schedules to train the model.
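A minimal sketch of steps 2 and 3, assuming the Ultralytics Python package; the freeze depth, learning-rate settings, and epoch count are illustrative, not our final values:

    # Hedged sketch: fine-tune a COCO-pretrained YOLOv8 model with the
    # first layers frozen (transfer learning); settings are illustrative.
    from ultralytics import YOLO

    model = YOLO("yolov8m.pt")       # pretrained weights
    model.train(
        data="data.yaml",            # dataset paths and weapon classes
        freeze=10,                   # freeze the first 10 layers
        optimizer="SGD",             # gradient-descent optimizer
        lr0=0.01,                    # initial learning rate
        cos_lr=True,                 # cosine (adaptive) LR schedule
        epochs=100,
    )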
4. Model Evaluation and Hyperparameter Tuning:
• Evaluate the trained model on the validation set using metrics like precision, recall, and
mean Average Precision (mAP).
• Conduct hyperparameter tuning, adjusting parameters such as learning rate, batch size,
and anchor sizes, to optimize the model's performance.
5. Real-time Inference:
• Implement the trained YOLOv8 model for real-time weapon detection on video
streams.
• Utilize techniques like frame sampling, parallel computation, and hardware
acceleration (e.g., GPUs) to ensure real-time processing.
6. Performance Evaluation:
• Evaluate the real-time weapon detection system on the testing set, measuring key
performance metrics like precision, recall, and accuracy.
• Analyze the system's performance in different scenarios, including varying lighting
conditions, occlusions, and camera angles.
7. Robustness and Usability Assessment:
• Assess the usability, reliability, and scalability of the system in detecting weapons and
providing real-time alerts for potential threats.
Throughout the project, document the methodology, experimental setup, and findings to
facilitate reproducibility and future enhancements. Continuously iterate and refine the
approach based on evaluation results and feedback.
Chapter 2
LITERATURE SURVEY
2.1. Related Work
In this section, we briefly discuss the previous work done on this project, “Smart
Surveillance Using YOLOv8”.
Related work in weapon detection using R-CNN, SSD, YOLOv3, and YOLOv5 showcases
the application of various deep learning techniques for detecting weapons in
surveillance videos. Here is an overview of the related work using these methods:
i) R-CNN:
Researchers have explored the use of R-CNN for weapon detection, where region proposal
techniques like selective search are employed to generate potential weapon regions. The
extracted regions are then classified using SVM or other classifiers. Limitations include slower
processing speed, inefficiency in handling scale and aspect ratios, and sensitivity to occlusions.
ii) SSD (Single Shot MultiBox Detector):
SSD has been utilized for real-time weapon detection, offering a one-stage approach that
predicts bounding boxes and class probabilities in a single pass. However, SSD may face
challenges in accurately detecting small-sized weapons and may have reduced precision
compared to other models.
iii) YOLOv3:
YOLOv3 has been extensively used in weapon detection projects, leveraging its grid-based
approach to predict bounding boxes and class probabilities directly. Researchers have
achieved real-time weapon detection with YOLOv3, although it may struggle with
overlapping objects and small-sized weapons, and may require significant computational
resources.
iv) YOLOv5 (You Only Look Once v5):
YOLOv5, a lightweight and optimized version of YOLO, has been applied in real-time weapon
detection scenarios. It offers improved speed and accuracy compared to its predecessors.
However, YOLOv5 may have limitations in detecting small-sized weapons, can be sensitive
to occlusions, and requires careful hyperparameter tuning.
These related works collectively demonstrate the application of deep learning techniques in
weapon detection, with varying emphasis on real-time processing, accuracy, and efficiency.
Each method has its advantages and limitations, and researchers have explored strategies to
address these limitations, such as incorporating data augmentation, optimizing network
architectures, and utilizing domain-specific datasets.
Overall, the related work provides valuable insights into the strengths and weaknesses of
different approaches, aiding in the selection and optimization of the most suitable method for
the proposed real-time weapon detection project.
In [4], real-time object detection and tracking is performed using the Single Shot Detector
(SSD) algorithm, which is based on the VGG-16 architecture. The implementation
methodology is complex: several methods are followed for object detection, such as frame
differencing, optical flow, and background subtraction, followed by object tracking using a
sequence of detections or detection with dynamics.
This study [5] introduces an innovative model that leverages advanced models such as YOLO,
renowned for its high-speed detection capabilities. The primary focus of this research is to
address the issue of false positives and negatives in Weapon Detection. To achieve this, the
model incorporates Gaussian blur to eliminate background noise and emphasizes the region of
interest. Additionally, the combination of YOLOv5 and Stochastic Gradient Descent (SGD)
enhances the overall performance of the proposed approach.
The introduction of YOLOv4 [6] in the object detection domain brought significant
improvements in both accuracy and speed compared to its predecessors. While the YOLOv4
paper did not explicitly focus on weapon detection, its advancements have exerted a
noteworthy influence on subsequent research in this area. Researchers have capitalized on the
enhanced detection capabilities of YOLOv4 and customized it specifically for weapon
detection, leading to notable advantages.
In their study, Nguyen et al. (2020) proposed a method for effectively detecting weapons in
surveillance videos [7] by leveraging the Single Shot MultiBox Detector (SSD). They designed
a customized SSD model and trained it on an extensive dataset of annotated surveillance
videos. The results of their research demonstrated the successful application of SSD for real-
time weapon detection in surveillance camera footage.
In this study [8], a novel gun detection system was proposed for security applications,
employing the YOLOv3 architecture. The researchers trained the model on an extensive dataset
consisting exclusively of gun images, placing emphasis on its focused nature. To enhance the
model's robustness, they incorporated a variety of data augmentation techniques. Through
rigorous experimentation, the system demonstrated outstanding accuracy and real-time
performance, establishing its suitability for real-world deployment.
Kim et al. (2021) [9] aimed to develop a deep learning model utilizing YOLOv4 for gun
detection in security systems. They collected a comprehensive dataset of gun images from
various sources and applied transfer learning techniques to fine-tune the YOLOv4 model. The
proposed model exhibited high accuracy and surpassed previous approaches in terms of
detection performance.
2.2. Limitations of Existing Systems
• Inefficient for real-time monitoring: Manual inspection is not suitable for real-time
monitoring of video streams, as it cannot process large amounts of video data in
real-time.
• Slow processing speed: Some machine learning approaches for weapon detection,
such as R-CNN or two-stage methods, may be computationally expensive and have
slower processing speeds, making them unsuitable for real-time applications.
• Limited scalability: Models with slow processing speeds may not scale well to
large-scale surveillance systems with multiple cameras or high-resolution video
streams.
• Limited generalization to unseen data: Many existing systems may have limited
generalization capabilities, as they may have been trained on specific datasets or
environments and may not perform well on unseen data or novel weapon types.
2.3. Proposed System
1. Real-time Processing: The system will be designed to operate in real-time, enabling the
detection of weapons in live video streams with minimal latency. This allows for proactive
threat identification and immediate response.
2. High Accuracy: By utilizing the YOLOv8 framework, the proposed system aims to achieve
high accuracy in weapon detection. The model will be trained on diverse datasets containing a
wide range of weapon types, orientations, scales, and occlusions, enhancing its detection
capabilities.
3. Efficient Architecture: YOLOv8 is optimized for speed and efficiency, making it suitable for
real-time applications. The proposed system will leverage the streamlined architecture to
process video frames quickly, enabling the analysis of high-resolution video streams in a timely
manner.
5. Robustness to Occlusions and Complex Backgrounds: The proposed system will incorporate
techniques to improve robustness to occlusions and complex backgrounds. This will involve
leveraging deep learning methods for feature extraction, utilizing advanced network
architectures, and incorporating data augmentation strategies.
6. User-friendly Interface and Alerts: The system will feature a user-friendly interface that
provides real-time visualization of the video feed and detected weapon regions. It will also
generate alerts or notifications to security personnel or relevant authorities when a potential
weapon is detected, enabling prompt response and threat mitigation.
By developing the proposed system, we aim to enhance public safety and security by providing
an automated, accurate, and real-time weapon detection solution. The system's adaptability,
efficiency, and robustness will enable its deployment in various surveillance scenarios,
contributing to proactive threat detection and efficient resource allocation.
Chapter 3
METHODOLOGY
3.1. Architecture
The weapon detection system aims to develop an automated solution for detecting
weapons in real-time using the YOLOv8m (You Only Look Once, medium) algorithm. The
system plays a crucial role in enhancing public safety and security by leveraging deep
learning techniques to accurately and efficiently detect weapons in surveillance videos.
To achieve robustness and reliability, it is essential to create a diverse dataset of
annotated images and videos that encompass various weapon types, angles, lighting
conditions, and backgrounds. This diverse dataset serves as the foundation for training
and fine-tuning the weapon detection model.
i) Dataset Creation:
The dataset creation phase is a crucial step in building an effective weapon detection
system. To create a diverse dataset, it is important to capture annotated images and
videos featuring different types of weapons. These images and videos should be
captured from various angles, under different lighting conditions, and against different
backgrounds. This diversity ensures that the model can generalize well and accurately
detect weapons in a wide range of real-world scenarios. Careful consideration should
be given to dataset size and diversity to ensure that the system is robust and can handle
variations in weapon appearance.
ii) Data Pre-processing:
Once the dataset is collected, it needs to be pre-processed to prepare it for training the
weapon detection model. This involves resizing the images to a standardized size and
normalizing the pixel values to ensure consistency across the dataset. Additionally, data
augmentation techniques such as rotation, scaling, and flipping should be applied to
increase the dataset's size and introduce variations in object appearance. Splitting the
dataset into training, validation, and testing subsets is also crucial for effective model
training and evaluation.
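As a concrete illustration of these steps, the sketch below (using OpenCV; the target size and file path are assumptions) resizes an image, normalizes its pixels, and applies a horizontal-flip augmentation:

    # Sketch: resize, normalize, and flip an image with OpenCV.
    import cv2
    import numpy as np

    def preprocess(path: str, size: int = 640) -> np.ndarray:
        img = cv2.imread(path)
        img = cv2.resize(img, (size, size))    # standardized size
        return img.astype(np.float32) / 255.0  # normalize pixel values to [0, 1]

    def flip_augment(img: np.ndarray) -> np.ndarray:
        # Horizontal flip; bounding-box x-coordinates must be mirrored to match.
        return cv2.flip(img, 1)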
iii) YOLOv8m Model Architecture:
The YOLOv8m algorithm is a popular choice for object detection, including weapon
detection. The model architecture follows a one-stage approach, allowing it to predict
object bounding boxes and class probabilities directly. To improve the model's
performance, it is advisable to pretrain it on a large-scale dataset like COCO or
ImageNet. This initial training provides a strong foundation for the model to learn
generic features. Next, the model should be fine-tuned using the collected weapon
dataset to specialize in weapon detection. Fine-tuning involves updating the model's
weights by minimizing a predefined loss function based on the annotated data.
iv) Model Evaluation:
Once the YOLOv8m model is trained, it needs to be evaluated to assess its performance.
The evaluation is typically performed using the validation subset of the dataset. Metrics
such as precision, recall, and mean average precision (mAP) are computed to measure
the model's accuracy and effectiveness in detecting weapons. Based on the evaluation
results, further iterations of fine-tuning and parameter adjustments can be conducted to
improve the model's performance.
v) Threshold Adjustment:
To optimize the performance of the weapon detection system, the confidence thresholds
need to be fine-tuned. These thresholds determine the sensitivity of the system in
detecting weapons. Adjusting the thresholds allows finding the right balance between
avoiding false positives (detecting non-weapons as weapons) and detecting true
positives (correctly identifying weapons). It may require experimentation and iterative
adjustments to find the optimal thresholds for the specific use case and desired
performance trade-offs.
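As an illustration, assuming the Ultralytics API and a hypothetical weights path, the confidence threshold can be passed directly at prediction time:

    # Sketch: tuning the confidence threshold at inference time.
    from ultralytics import YOLO

    model = YOLO("runs/detect/train/weights/best.pt")  # fine-tuned weights (path assumed)
    results = model.predict("frame.jpg", conf=0.50)    # raise conf for fewer false positives
    for box in results[0].boxes:
        print(int(box.cls), float(box.conf))           # class id and confidence score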
3.2. Sequence Diagram
To train and detect using YOLOv8, you need to start by collecting a dataset and annotating it
with labelled bounding boxes. Next, configure the YOLOv8 architecture and set appropriate
hyperparameters for training. Train the network using the annotated dataset to learn how to
detect objects. Once the model is trained, you can use it for object detection by providing
input images and processing the output bounding box predictions. YOLOv8 uses a single-
stage object detection approach, making it efficient and capable of real-time detection. It
combines high accuracy with fast inference times, making it a popular choice for various
computer vision applications.
Chapter 4
TOOL DESCRIPTION
• Python: Python is a high-level, general-purpose programming language that ships with a
comprehensive standard library. The standard library provides modules and packages for
performing common tasks, such as file I/O, networking, database access, and more, which
greatly enhances Python's capabilities and reduces the need for external dependencies.
• OpenCV: OpenCV stands for Open Source Computer Vision Library. It is a BSD-licensed,
open-source library that includes hundreds of advanced computer vision algorithms, many of
them optimized to use hardware acceleration. OpenCV is commonly used for machine
learning, image processing, image manipulation, and much more.
OpenCV has a modular structure. Image manipulation is easily performed in a few lines
of code using OpenCV.
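For instance, a few lines suffice to load, transform, and annotate an image (the file names here are placeholders):

    import cv2

    img = cv2.imread("sample.jpg")                            # load an image
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)              # grayscale conversion
    small = cv2.resize(img, (320, 240))                       # resize
    cv2.rectangle(img, (50, 50), (200, 200), (0, 255, 0), 2)  # draw a box
    cv2.imwrite("annotated.jpg", img)                         # save the result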
• Google Colab: Colab is based on the popular Jupyter Notebook interface and supports
the creation of interactive notebooks. These notebooks consist of code cells that can be
executed individually, allowing for an interactive and iterative coding experience.
Users can write Python code, execute it, view the results, and add explanatory text or
visualizations in Markdown cells. One of the key advantages of Google Colab is its
provision of computing resources in the cloud. It offers both CPU and GPU (Graphical
Processing Unit) options, allowing users to run code that requires intensive
computational power, such as machine learning or deep learning tasks. The availability
of GPU resources can significantly speed up the execution of these computationally
demanding algorithms. Colab also integrates with other Google services. It provides
seamless access to Google Drive, allowing users to import and export files, including
datasets, code scripts, and trained models. Moreover, it enables collaboration by
allowing multiple users to work on the same notebook simultaneously, making it
suitable for team projects or educational purposes.
• Flask: Flask is a micro web framework written in Python. It is designed to be simple
and lightweight, providing the basic tools and features needed to build web
applications. Flask follows the principle of simplicity, focusing on providing a solid
foundation for web development while allowing developers the freedom to choose and
integrate additional libraries and tools as needed. Flask allows developers to define
routes for different URLs or endpoints of a web application. With Flask's route
decorators, developers can specify the URL patterns and associated functions that
handle incoming requests. Flask supports common HTTP methods like GET, POST,
PUT, DELETE, etc. Flask includes a template engine, called Jinja, which enables
developers to generate dynamic HTML content. Templates can be used to separate the
presentation layer from the application logic, making it easier to maintain and modify
web pages. Flask is widely used for building small to medium-sized web applications,
RESTful APIs, and microservices. Its simplicity, flexibility, and extensibility make it a
popular choice among Python developers who value lightweight frameworks and the
ability to customize and integrate additional components as needed.
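A minimal Flask application illustrating a route decorator (the route and message are placeholders, not our actual interface):

    from flask import Flask

    app = Flask(__name__)

    @app.route("/")              # URL pattern mapped to the handler below
    def index():
        return "Weapon detection server running"

    if __name__ == "__main__":
        app.run(debug=True)      # built-in development server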
• Deep Learning: Deep learning is a subfield of machine learning that focuses on building
and training artificial neural networks inspired by the structure and function of the
human brain. It aims to enable computers to learn and make complex decisions or
predictions by simulating the behaviour of interconnected neurons in neural networks.
Deep learning relies on artificial neural networks, which are composed of
interconnected layers of artificial neurons or nodes. These networks are designed to
mimic the behaviour of the human brain, where each neuron takes inputs, applies
weights, performs computations, and produces an output. Deep learning emphasizes the
use of deep neural networks, which have multiple hidden layers between the input and
output layers. The term "deep" refers to the depth of the network, as it has a large
number of layers compared to traditional neural networks. Deep learning algorithms
require large amounts of labelled training data to effectively learn and generalize. With
the advent of big data and advancements in computing power (including GPUs), deep
learning has become more feasible and has achieved impressive results in various
domains, such as image recognition, natural language processing, and speech
recognition.
• Visual Studio Code: Visual Studio Code (VS Code) is a lightweight and versatile source
code editor developed by Microsoft. It is available for Windows, macOS, and Linux
and is widely used by developers for various programming languages and frameworks.
VS Code provides a rich and intuitive code editing experience. VS Code has a vast
extension marketplace that offers a wide range of extensions developed by the
community. These extensions provide additional functionalities, such as language
support, debugging capabilities, code snippets, version control integration, and more.
Developers can customize and enhance their editor's functionality by installing the
extensions that suit their needs. VS Code allows the configuration and execution of
various tasks, such as building, testing, and running applications, through its integrated
task runner. Users can define custom tasks or leverage predefined task configurations
for popular tools and frameworks, streamlining common development workflows.
Chapter 5
IMPLEMENTATION
i) Dataset Creation:
To create our custom dataset for YOLO models, we collected a variety of images featuring
different types of weapons. These images were manually labeled using an open-source website
called Make Sense. Make Sense is a popular tool used for creating datasets specifically for
YOLO (You Only Look Once) models, which are widely used for object detection tasks.
In total, we gathered 900 labeled images for our dataset, with 100 images dedicated to each
selected category of weapon. This ensured a diverse range of weapon types and variations
within each category. The purpose of such labeling is to provide ground truth annotations for
the model to learn from, enabling it to accurately identify and classify weapons in real-world
scenarios.
Additionally, we created a separate validation dataset consisting of 180 labeled images. Within
this validation set, there were 18 images for each category of weapon. The validation dataset
serves as an independent sample to assess the performance and generalization capability of the
trained YOLO models.
By manually labelling the images and creating this custom dataset, we have taken an essential
step in training and evaluating our YOLO models for weapon detection. The dataset's diversity
and the inclusion of a validation set will contribute to a robust and reliable model capable of
accurately detecting various types of weapons in different settings.
Fig 5.1: Dataset Labelling using MakeSenseAI
ii) YAML Configuration:
Training the YOLO model requires a YAML configuration file that specifies the dataset
paths, the classes of data, and the model settings.
Classes of Data:
Another crucial component of the YAML file is specifying the classes or categories of data
present in the dataset. In our case, since we have selected different categories of weapons, we
would include a list of these weapon classes in the configuration file. For example, the classes
might include "handguns," "rifles," "knives," "explosives," and other relevant categories based
on the specific types of weapons chosen.
Model Configuration:
The YAML file also includes various parameters related to the model architecture and training
process. These parameters might include the network backbone, the number of anchor boxes
to be used, the input image size, the learning rate, batch size, and other hyperparameters. These
settings define the model's architecture and guide the training process.
By creating a YAML configuration file with the relevant paths, classes, model settings, and
optimization parameters, we ensure that the training process is well-defined and can be
executed smoothly. The configuration file serves as a blueprint for training the YOLO model
on our custom dataset, enabling us to fine-tune the model to accurately detect weapons based
on the labeled images and annotations we have prepared.
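A sketch of the dataset portion of such a configuration is shown below; the paths and class names are illustrative only (our actual file is shown in Fig 5.2):

    # Illustrative YOLO dataset configuration (data.yaml)
    train: dataset/train/images     # path to training images
    val: dataset/val/images         # path to validation images

    nc: 4                           # number of classes (illustrative)
    names: ["handgun", "rifle", "knife", "explosive"]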
Fig 5.2: YAML Configuration File
iii) Training:
To train our YOLO model using the YOLOv8m algorithm for object detection, we followed a
two-step process: pretraining on a large-scale dataset and fine-tuning on our collected weapon
dataset. Here are the details and parameters we used during the training process:
3. Hyperparameter Configuration:
During the training process, we adjusted several hyperparameters to optimize the performance
of our YOLO model. Here are the specific parameters we used:
- Batch Size: We set the batch size to 16. This parameter determines the number of images
processed in each iteration during training. It affects memory usage and training speed. A larger
batch size can lead to faster convergence but requires more memory.
- Epochs: We trained the model for 300 epochs. An epoch represents one complete pass through
the entire training dataset. Training for multiple epochs allows the model to learn from the data
repeatedly and refine its performance over time.
- Image Size (imgsz): We set the image size to 640. This parameter determines the resolution
at which the images are processed during training. Larger image sizes can capture more details
but require more computational resources. The choice of image size depends on the specific
requirements of the task and available resources.
- Resume: We set the "resume" parameter to False. This indicates that we are starting the
training from scratch and not resuming from a previous checkpoint or trained model.
Throughout the training process, it is common to monitor metrics such as loss values, mAP
(mean Average Precision), and IoU (Intersection over Union) scores to evaluate the model's
progress and make further adjustments if necessary. Fine-tuning the pretrained YOLO model
using our custom weapon dataset with the provided hyperparameters helps the model specialize
in detecting weapons accurately, enhancing its practical applicability in real-world scenarios.
Fig 5.3: Command for training YOLO for our custom data.
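With the Ultralytics interface, an invocation carrying the hyperparameters above would look roughly like the sketch below; the data.yaml path is an assumption, and the exact command we ran is the one shown in Fig 5.3:

    yolo task=detect mode=train model=yolov8m.pt data=data.yaml \
         epochs=300 imgsz=640 batch=16 resume=False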
iv) Evaluation:
After training our YOLOv8m model on the weapon detection dataset, we proceeded to evaluate
its performance using the validation subset. The evaluation process involved measuring the
mean average precision (mAP), which is a commonly used metric for assessing the accuracy
of object detection models. Here are the evaluation results we obtained:
1. mAP50: 87.2%
mAP50 measures the average precision at an IoU (Intersection over Union) threshold of 0.5. It
represents how well the model accurately localizes and classifies weapons when there is at least
a 50% overlap between the predicted bounding boxes and the ground truth annotations. A
mAP50 of 87.2% indicates that the model performs well in detecting weapons with a
reasonable degree of accuracy.
2. mAP50-90: 70.3%
mAP50-90 measures the average precision over a range of IoU thresholds from 0.5 to 0.9, in
increments of 0.05. It provides a broader evaluation of the model's performance across a range
of IoU thresholds, indicating how well the model generalizes to different levels of bounding
box overlap. An mAP50-90 of 70.3% suggests that the model maintains good accuracy across
a wider range of IoU thresholds.
These evaluation results demonstrate that our trained YOLOv8m model performs well in
detecting weapons in the validation subset of our dataset. The mAP50 of 87.2% indicates a
high degree of accuracy in localizing and classifying weapons with a 50% IoU threshold, while
the mAP50-90 of 70.3% reflects the model's ability to generalize well to different levels of
bounding box overlap.
It is worth noting that evaluation metrics can vary depending on the specific requirements and
thresholds set for the object detection task. Additionally, it is important to consider other factors
such as the balance between precision and recall, false positive and false negative rates, and
the specific application context when assessing the overall effectiveness of the model.
These evaluation results provide valuable insights into the performance of our trained
YOLOv8m model and indicate its potential for accurate weapon detection. However, further
testing and evaluation in real-world scenarios are recommended to validate its effectiveness
and ensure reliable performance in practical applications.
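For reference, metrics of this kind can be reproduced on the validation split with the Ultralytics API (a sketch; the weights path is an assumption):

    # Sketch: validating a trained model and reading its mAP scores.
    from ultralytics import YOLO

    model = YOLO("runs/detect/train/weights/best.pt")  # trained weights (path assumed)
    metrics = model.val(data="data.yaml")
    print(metrics.box.map50)   # mAP at IoU 0.5
    print(metrics.box.map)     # mAP averaged over a range of IoU thresholds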
v) Implementation:
Implementing a real-time streaming application for weapon detection involves integrating a
Flask web application with OpenCV to capture live video, convert it into frames, and pass each
frame through our custom YOLO model for detection. Here's a more detailed explanation of
the process:
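In outline, the loop below is a minimal sketch of this pipeline rather than our full application; the weights path and camera index are assumptions. OpenCV captures frames, each frame passes through the model, and annotated frames are streamed to the browser as an MJPEG response:

    # Minimal Flask + OpenCV + YOLO streaming sketch.
    import cv2
    from flask import Flask, Response
    from ultralytics import YOLO

    app = Flask(__name__)
    model = YOLO("runs/detect/train/weights/best.pt")  # trained weights (path assumed)

    def generate_frames():
        cap = cv2.VideoCapture(0)                      # default camera
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            results = model.predict(frame, conf=0.5, verbose=False)
            annotated = results[0].plot()              # boxes + confidence scores drawn
            ok, buf = cv2.imencode(".jpg", annotated)
            if not ok:
                continue
            yield (b"--frame\r\n"
                   b"Content-Type: image/jpeg\r\n\r\n" + buf.tobytes() + b"\r\n")

    @app.route("/video")
    def video():
        return Response(generate_frames(),
                        mimetype="multipart/x-mixed-replace; boundary=frame")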
By integrating a Flask web application with OpenCV and our custom YOLO model, we are
able to capture live video, convert it into frames, perform weapon detection on each frame in
real-time, and display the results back to the user through a web interface. This implementation
enables us to create a practical and interactive application for real-time weapon detection using
our trained YOLO model.
vi) Detection:
During the real-time video streaming process, we applied our trained YOLO model to each
frame to detect and localize weapons. This involved drawing bounding boxes around the
detected weapons and displaying the confidence score of each detection. Here's a more detailed
explanation of the implementation:
Each detected weapon is outlined with a bounding box, and its confidence score is displayed
alongside it. The confidence threshold controls the sensitivity of this step: a lower threshold
yields more detections but a higher chance of false positives, while higher thresholds may
lead to fewer detections but with increased accuracy.
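The drawing step can be sketched as follows, assuming Ultralytics result objects and a class-name mapping (for example, model.names from the loaded model):

    # Sketch: draw a labelled bounding box for each detection.
    import cv2

    def draw_detections(frame, results, names):
        for box in results[0].boxes:
            x1, y1, x2, y2 = (int(v) for v in box.xyxy[0])  # corner coordinates
            conf = float(box.conf)                          # confidence score
            label = f"{names[int(box.cls)]} {conf:.2f}"
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
            cv2.putText(frame, label, (x1, max(15, y1 - 5)),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
        return frame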
By implementing this process, we were able to leverage the YOLO model to detect and localize
weapons in real-time video streams. The bounding boxes, confidence scores, and continuous
output display facilitated easy visualization and understanding of the detected weapons,
enabling users to assess the situation promptly and take appropriate actions if necessary.
Chapter 6
RESULTS AND ANALYSIS
Fig 6.1: Specified the window pixels and initiation of the flask module
Fig 6.2: Specified the window pixels and logs of the flask module
Fig 6.3: Information on the layers, parameters, gradients, and GFLOPs reported by the YOLOv8 model
Fig 6.4: Output of the model shown on the User Interface in Realtime
Our model achieved an mAP50 of 87.2% over six categories of weapons, which is a strong result for a
detection model. Our YOLOv8m model is accurate in detecting weapons and can also detect multiple
weapons in a single frame. Below are a few results predicted on a live stream through our Flask web application.
Fig 6.6: Model detecting various types of pistols
Chapter 7
CONCLUSIONS AND FUTURE SCOPE
7.1 Conclusion
YOLOv8 is the latest release in the family of YOLO models, defining a new state of the
art in object detection. Benchmarks on the Roboflow 100 dataset show a significant
performance improvement from YOLOv5 to YOLOv8.
In conclusion, this project explored the application of YOLOv8m for real-time
weapon detection in streaming environments. Through extensive
experimentation and meticulous analysis, the study showcased the effectiveness and efficiency
of YOLO in swiftly and accurately identifying weapons. Integrating YOLO into real-time
streaming systems has significant implications for bolstering security measures, expediting
threat recognition, and enabling prompt responses in critical scenarios. The findings underscore
YOLO's potential as a valuable tool for real-time weapon detection across diverse domains,
such as public spaces, transportation hubs, and critical infrastructure, thereby augmenting
safety and security. Future research endeavours aimed at advancing and optimizing YOLOv8m
and its related techniques hold tremendous promise in advancing ongoing initiatives to ensure
public safety and security.
REFERENCES
[1] J. L. González, C. Zaccaro, J. Álvarez-García, L. Soria Morillo, and F. Caparrini,
"Real-time gun detection in CCTV: An open problem," Neural Networks, vol. 132,
pp. 297-308, 2020, doi: 10.1016/j.neunet.2020.09.013.
[4] G. Chandan, A. Jain, H. Jain and Mohana, "Real Time Object Detection and Tracking
Using Deep Learning and OpenCV," 2018 International Conference on Inventive
Research in Computing Applications (ICIRCA), Coimbatore, India, 2018, pp. 1305-1308,
doi: 10.1109/ICIRCA.2018.8597266.
[5] M. Asad, T. Hashmi, and O. Rasheed, "Multiplatform Surveillance System for Weapon
Detection using YOLOv5," 2023, doi: 10.1109/ICET56601.2022.10004690.
[6] A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, "YOLOv4: Optimal Speed and
Accuracy of Object Detection," arXiv:2004.10934, 2020.
[7] Nguyen et al., "Efficient Weapon Detection in Surveillance Videos using SSD," 2020.
[8] https://towardsdatascience.com/step-by-step-yolo-model-deployment-in-localhost-using-python-8537e93a1784
[9] https://www.makesense.ai/
[10] https://docs.ultralytics.com/reference/hub/auth
[11] https://www.augmentedstartups.com/blog/10-things-you-need-to-know-about-ultralytics-yolov8
[12] https://blog.roboflow.com/whats-new-in-yolov8/
[13] https://ultralytics.com/article/Introducing-Ultralytics-YOLOv8
[14] https://blog.roboflow.com/how-to-train-yolov8-on-a-custom-dataset/
[15] https://labelstud.io/blog/quickly-create-datasets-for-training-yolo-object-detection-with-label-studio/
APPENDIX