Final_Year_Synopsis_Format (2)
ON
1. Develop a platform for achieving globally acceptable level of intellectual acumen and
technological competence.
2. Create an inspiring ambience that raises the motivation level for conducting quality
research.
“To spark the imagination of the Computer Science Engineers with values, skills and
creativity to solve the real-world problems.”
2. To empower professionals with core competency in the field of Computer Science and
Engineering.
3. To foster independent and lifelong learning with ethical and social responsibilities.
ii
PROGRAM OUTCOMES(POs)
iii
receive clear instructions.
PO11: Project management and finance: Demonstrate knowledge and understanding of
the engineering and management principles and apply these to one’s own work, as a member
and leader in a team, to manage projects and in multidisciplinary environments.
PO12: Life-long learning: Recognize the need for, and have the preparation and ability to
engage in independent and life-long learning in the broadest context of technological change.
PEO1: To apply computational skills necessary to analyze, formulate and solve engineering
problems.
PEO2: To establish themselves as entrepreneurs, and to work in interdisciplinary research and
development organizations as an individual or in a team.
PEO3: To inculcate ethical values and leadership qualities in students to have a successful
career.
PEO4: To develop analytical thinking that helps them to comprehend and solve real-world
problems and inherit the attitude of lifelong learning for pursuing higher education.
iv
Course Outcomes(COs)
C410.1: Identify, formulate, design and analyze a research based/web based problem.
C410.2: Communicate effectively in verbal and written form.
C410.3: Apply appropriate computing and engineering skills for obtaining a solution to the
formulated problem within a stipulated time.
C410.4: Work effectively as part of a team in multi-disciplinary areas.
C410.5: Consolidate the final outcome in the form of a publication.
CO-PO-PSO Mapping
CO      PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12  PSO1  PSO2
C410.1  3     3     3     3     2     3     3     3     3     3     2     3     3     3
C410.2  2     2     2     2     2     2     2     2     2     3     2     3     2     3
C410.3  3     3     3     3     3     3     2     3     3     3     3     3     3     3
C410.4  3     3     3     3     2     3     2     3     3     3     3     3     3     3
C410.5  3     3     3     3     3     3     3     3     3     3     3     3     3     3
C410    2.80  2.80  2.80  2.80  2.40  2.80  2.40  2.80  2.80  3.00  2.60  3.00  2.80  3.00
v
DECLARATION
We hereby declare that this synopsis submission is our own work and that, to the best of
our knowledge and belief, it contains no material previously published or written by another
person nor material which to a substantial extent has been accepted for the award of any
other degree or diploma of the university or other institute of higher learning, except where
due acknowledgment has been made in the text.
(Candidate Signature)
(Candidate Signature)
(Candidate Signature)
vi
CERTIFICATE
This is to certify that the project synopsis report entitled “Detecting oil spills at marine
environment using Automatic Identification System (AIS) and satellite datasets”,
which is done by Tushar Pandey (2100910100176), Vaani Pathariya (2100910100178), and
Vaibhav Anand (2100910100179) in partial fulfillment of the requirement for the award of
the degree of B. Tech. in the Department of Computer Science and Engineering of Dr. APJ Abdul
Kalam Technical University, Uttar Pradesh, Lucknow, is a record of the candidates’ own work
carried out by them under my supervision. The matter embodied in this report is original
and has not been submitted for the award of any other degree.
Signature
Date:
vii
Detecting oil spills at marine environment using
Automatic Identification System (AIS) and satellite
datasets
ABSTRACT
Detecting oil spills in marine environments is crucial for preventing environmental dam-
age and facilitating rapid response efforts. This study proposes a robust method for oil
spill detection by leveraging state-of-the-art (SOTA) deep learning techniques. We created a
comprehensive dataset consisting of images and frames extracted from videos sourced from
Google, significantly augmenting the dataset using frame extraction techniques. Each image
was meticulously labeled to ensure high-quality training data. The YOLOv8 segmentation
model was employed to train our oil spill detection model, enabling it to accurately identify
and segment oil spills in ocean environments. To further enhance detection accuracy,
K-means clustering and Truncated Linear Stretching algorithms were integrated with the
trained model’s weights. The model demonstrated exceptional performance, achieving high
detection accuracy and precise segmentation capabilities. In the training phase, the model
reached an accuracy exceeding 97% after 100 epochs. During evaluation, the model achieved
remarkable detection rates, with an F1 score of 94%, precision of 93.9%, and mean average
precision (mAP) at 0.5 IoU of 95.5%. These results underscore the effectiveness of this
approach for real-time oil spill detection, making it a promising tool for environmental mon-
itoring and disaster management in marine ecosystems. This method shows great potential
for enhancing oil spill detection, particularly in areas where rapid identification and response
are critical. The integration of deep learning models, along with innovative algorithms for
image enhancement, offers significant improvements in accuracy, precision, and overall per-
formance, demonstrating its potential to address key challenges in marine pollution detection
and environmental protection.
viii
TABLE OF CONTENTS
Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
Certificate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii
CHAPTER 1 INTRODUCTION 1
CHAPTER 2 OBJECTIVE(S) 3
CHAPTER 8 CONCLUSION 16
ix
CHAPTER 1
INTRODUCTION
Marine oil spills represent a significant environmental threat, causing severe ecolog-
ical damage and economic loss to coastal communities and wildlife. These spills,
which occur during various stages of the oil lifecycle—from exploration to transporta-
tion—pose immediate risks to marine ecosystems. Detecting and monitoring oil spills
in a timely manner is crucial for mitigating their adverse effects, facilitating swift
response actions, and minimizing long-term damage. Traditional methods for detect-
ing and monitoring oil spills, such as satellite imagery, aerial surveillance, and man-
ual reporting, often face limitations due to weather conditions, coverage areas, and
time-sensitive response needs. Recent advancements in remote sensing technologies,
including synthetic aperture radar (SAR), hyperspectral, and multispectral imaging,
have improved the accuracy and scope of oil spill detection. However, challenges re-
main in distinguishing oil slicks from other phenomena, such as planktonic algae or
shallow sea areas, and handling the effects of variable environmental factors like wind
speed. While SAR technology can identify oil-contaminated seawater based on differ-
ences in backscattering coefficients, its accuracy is often impacted by external factors.
The emergence of polarimetric SAR (PolSAR) has further enhanced detection capa-
bilities but requires sophisticated feature extraction algorithms to handle its complex
data. In the face of these challenges, deep learning techniques have emerged as a pow-
erful solution for automatic oil spill detection. These models, particularly in computer
vision, can automatically extract features from raw data, enhancing the precision of
detection tasks. Recent studies have shown that deep learning models, such as YOLO
(You Only Look Once) and its variants, can effectively detect and segment oil spills in
marine environments. However, the success of these models depends heavily on the
quality and diversity of the dataset used for training, as well as the ability to handle the
dynamic and complex nature of the marine environment.
In this study, we present a robust method for oil spill detection in marine environ-
ments that leverages state-of-the-art (SOTA) deep learning techniques, specifically the
YOLOv8 segmentation model. We propose a comprehensive approach that involves
the construction of a custom, high-resolution dataset of oil spill images, which were
meticulously labeled using semantic segmentation. The dataset was sourced from a
variety of internet repositories, providing a diverse set of scenarios that include vary-
ing oil slick shapes, sizes, and environmental conditions. This dataset is critical for
ensuring high-quality training data for deep learning models, which is a key factor
in achieving reliable and accurate detection results. Our approach further enhances
the performance of the YOLOv8 segmentation model by incorporating two advanced
image processing techniques: the K-means algorithm and Truncated Linear Stretch-
ing. These methods are integrated with the trained model weights to improve detec-
tion accuracy and robustness. The model is trained to detect and segment oil spills
effectively, achieving an impressive accuracy of over 97% after 100 epochs. In evaluation,
the model demonstrated high performance with an F1 score of 94%, precision of 93.9%, and
mean average precision at an IoU threshold of 0.5 (mAP@0.5) of 95.5%. The primary contributions of
this paper include the creation of a custom oil spill dataset, the development of a deep
YOLOv8-based segmentation model, and the integration of K-means and Truncated
Linear Stretching methods to optimize detection accuracy. This research aims to pro-
vide a comprehensive solution to oil spill detection that is not only accurate but also
adaptable to the dynamic nature of the marine environment. By improving detection
capabilities through these advanced techniques, we hope to contribute to more efficient
environmental monitoring and disaster management efforts.
2
CHAPTER 2
OBJECTIVE(S)
3
CHAPTER 3
This project focuses on the development and optimization of a comprehensive oil spill detection model for marine environments, using advanced deep learning techniques, particularly the YOLOv8 segmentation model. The scope of the project encompasses several key areas aimed at improving the effectiveness and accuracy of oil spill detection systems in real-world applications. These areas are outlined below:

1. Dataset Creation and Augmentation
Custom Oil Spill Dataset: The project involves the creation of a high-resolution, custom dataset of oil spill images sourced from various online repositories, including videos. This dataset includes diverse environmental conditions, oil spill shapes, and sizes to ensure that the model is trained to handle real-world variability in marine environments.
Semantic Segmentation for Labeling: Each image in the dataset is meticulously labeled using a semantic segmentation approach, ensuring precise delineation of oil spill regions. This high-quality labeling is crucial for training deep learning models and achieving high detection accuracy.
Data Augmentation: To improve the robustness of the model, the dataset is augmented using various techniques, including frame extraction from videos, to increase the number of training samples and improve model generalization.

2. Deep Learning Model Development
YOLOv8 Segmentation Model: The core of the project revolves around fine-tuning the YOLOv8 segmentation model, a state-of-the-art deep learning architecture known for its high efficiency and real-time object detection capabilities. YOLOv8 is adapted to specifically identify and segment oil spills in marine environments, which are often irregular in shape and vary in size.
Model Training and Optimization: The model is trained on the custom dataset, incorporating advanced optimization techniques, including K-means clustering and Truncated Linear Stretching. These methods help improve detection accuracy by addressing issues like noise, varying lighting conditions, and diverse oil slick forms.

3. Integration of Advanced Image Processing Techniques
K-means Clustering: This algorithm is used to improve the model’s ability to identify distinct regions of oil spills by clustering image pixels based on their similarity. It helps the model focus on key features and reduce the impact of irrelevant data, such as background noise.
Truncated Linear Stretching: This technique is applied to enhance image contrast and make oil slicks more distinguishable from the background, improving the model’s ability to detect spills under challenging conditions like varying water textures and lighting.

4. Evaluation and Performance Metrics
Model Evaluation: After training, the model’s performance is evaluated using key metrics such as accuracy, F1 score, Precision, and Recall. These metrics provide insights into the model’s ability to correctly identify and segment oil spills.
Real-world Testing: The model’s performance is tested under different marine environmental conditions, including various weather scenarios, oil types, and water surfaces. This ensures that the model can handle the complexity and variability of real-world marine environments.

5. Comparative Analysis
Comparison with Existing Methods: The project includes a comparative analysis of the YOLOv8-based model with traditional oil spill detection methods, such as SAR (Synthetic Aperture Radar) imaging and PolSAR (Polarimetric SAR). This allows for a thorough assessment of the advantages and limitations of the deep learning-based approach in comparison to established techniques.
Benchmarking Against State-of-the-Art Models: The performance of the proposed model is compared with existing deep learning models and algorithms, highlighting improvements in detection accuracy, segmentation precision, and overall performance.

6. Real-Time Detection Potential
Application for Environmental Monitoring: One of the key goals of the project is to develop a model capable of real-time oil spill detection. This can be used for rapid response efforts, allowing authorities to take timely actions to mitigate the impact of spills on marine ecosystems.
Disaster Management: The system could be integrated into disaster management frameworks, providing a tool for continuous surveillance and assessment of marine oil spills. This would enhance the ability to monitor oil spills in large-scale marine environments effectively.

7. Future Directions and Scalability
Scalability to Other Marine Environments: While this study focuses on oil spill detection, the methods and techniques developed could be adapted for detecting other types of environmental hazards or monitoring marine ecosystems.
Further Optimization: Future work may explore additional optimizations to improve model performance, such as incorporating more advanced deep learning architectures or expanding the dataset to include a wider range of oil types, environmental conditions, and spill scenarios.
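As a concrete illustration of the evaluation metrics named above, the following minimal sketch computes pixel-level precision, recall, F1 score, and IoU from a predicted and a ground-truth binary mask. The helper name `segmentation_metrics` and the toy masks are assumptions made for illustration; they are not taken from the project code.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Pixel-level metrics for two binary masks of equal shape (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))    # oil pixels correctly predicted as oil
    fp = int(np.sum(pred & ~truth))   # water pixels predicted as oil
    fn = int(np.sum(~pred & truth))   # oil pixels that were missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy example: 1 marks an oil pixel, 0 marks water.
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]])
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

The IoU value is the quantity thresholded by mAP@0.5: a predicted region counts as a correct detection only when its IoU with the ground truth is at least 0.5.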
CHAPTER 4
RELATED WORK
Marine oil spill detection primarily relies on SAR and optical images. SAR imagery is favored for its all-weather and day-night capabilities, while optical imagery is limited by weather and daylight conditions. The spectral characteristics of optical images vary with factors like oil type, thickness, and environmental conditions, making consistent detection challenging. Multispectral datasets, such as MODIS, Landsat, and KOMPSAT-2, are widely used in oil spill studies, often enhanced by machine learning (ML) techniques. Traditional ML models like SVM, KNN, and random forest classify spills using features like geometry, texture, and polarimetry. Deep learning models like CNNs and fully convolutional networks (e.g., AConcNets) address scalability and adaptability issues, improving accuracy. Advancements in using fluorescence and rotation-absorption indices from hyperspectral imagery have reduced false positives and improved mapping precision. Near-infrared (NIR) and shortwave infrared (SWIR) bands effectively detect spills in challenging conditions, enhancing the reliability of remote sensing for environmental monitoring.

4.1 Color Attributes for Object Detection
Object detection is a challenging task in computer vision due to factors like perspective, scale, and occlusion. Traditional methods often focus on intensity-based features, excluding color due to variations from lighting, shadows, and compression. However, combining color and shape features has shown improved performance in image classification. Semantic segmentation, a technique used in remote sensing, performs pixel-level classification, accurately mapping elements like oil spills and ships. It excels at delineating precise boundaries, enhancing image interpretation. Recent advancements use selective search techniques, like hierarchical segmentation, to improve object detection, ensuring comprehensive recognition with high accuracy across diverse environments.

4.2 You Only Look Once (YOLO)
Yang et al. evaluated the YOLO-v4 algorithm for detecting marine oil spills, even under challenging conditions like shadows and low light. They developed a specialized dataset to test the algorithm’s effectiveness but faced limitations due to YOLO-v4’s reliance on predefined anchor sizes, which struggle with the varying shapes of oil spills in dynamic marine environments. Zhang et al. improved on this by introducing the YOLOX-S model, which addressed inconsistent SAR image contrast. They used a truncated linear stretch for contrast enhancement and leveraged CSPDarknet and PANet for feature extraction, enhancing the algorithm’s ability to detect oil spills in marine settings.
8
CHAPTER 5
PROPOSED METHOD
In this study, we propose a new optical oil spill dataset and train an oil spill detection model by fine-tuning YOLO-v8. Moreover, we employ a combination of unsupervised machine learning techniques to enhance the accuracy of detecting marine oil spills. From the analysis, it can be seen that traditional methods often struggle with the diverse and challenging visual characteristics of oil spills, such as varying color, textures, and contrast levels, particularly in SAR imagery. To address these challenges, we integrate SOTA algorithms aimed at enhancing image contrast and segmentation, thereby enabling more precise delineation of oil spill areas.

5.1. K-Means Clustering for Color Segmentation
K-Means clustering is a widely used unsupervised machine learning algorithm for segmenting images based on color similarity, making it useful for detecting oil spills, which often have unique color characteristics. The algorithm groups pixels with similar colors into clusters, enhancing the discrimination between oil spills and the surrounding water. By clustering pixels based on their color values, K-means helps to identify regions that correspond to oil spills, despite variations caused by factors such as lighting, weather, and water movement. In this approach, we cluster a dataset of P points into K clusters, where K represents the number of desired clusters. For oil spill detection, K is typically set to 3, corresponding to the primary Red, Green, and Blue (RGB) color channels used for training the oil spill model. Each cluster has a corresponding centroid $c_1, c_2, \ldots, c_K$, where the centroid $c_k$ of the k-th cluster is the average of all points belonging to that cluster:

$$c_k = \frac{1}{|S_k|} \sum_{X_p \in S_k} X_p$$

Here, $S_k$ represents the set of points belonging to the k-th cluster, and $X_p$ represents the pixel values in the cluster. Each pixel $X_p$ is assigned to the cluster whose centroid minimizes the distance:

$$k^{*}(X_p) = \arg\min_{k \in \{1, \ldots, K\}} \lVert X_p - c_k \rVert^{2}$$

To apply K-means clustering in image segmentation, the image, represented in the HSV color space, is reshaped from a 3D array (height, width, color channels) into a 2D array, where each row corresponds to a pixel’s HSV values. The algorithm then assigns clusters and reconstructs the segmented image by mapping the cluster center values back to the pixels. The output is a segmented image that isolates regions corresponding to oil spills.

5.2 Truncated Linear Stretching (TLS)
Truncated Linear Stretching (TLS) is a technique used to enhance image contrast by adjusting pixel intensity values within a specific range. This method focuses on stretching the pixel intensities to utilize the full range of values more effectively, while ignoring extreme outliers that could skew the stretching process. The "truncated" aspect means only a subset of the pixel values, defined by the lower (LP) and upper (UP) percentiles, are adjusted. To calculate the percentiles (LP and UP), we first compute the 2nd and 98th percentiles of the image’s intensity distribution, where $I$ denotes the image intensities and $P_q$ the q-th percentile:

$$LP = P_{2}(I), \qquad UP = P_{98}(I)$$

These values define the range within which pixel values will be stretched. Next, TLS clips the pixel values to the range between the lower and upper percentiles:

$$I_{clip} = \mathrm{clip}(I, LP, UP)$$

The np.clip function ensures that pixel values are within the defined range [LP, UP]. After clipping, a linear transformation is applied to scale the pixel values to the full dynamic range [0, 255]:

$$I_{out} = \frac{I_{clip} - LP}{UP - LP} \times 255$$

This transformation adjusts the pixel values so that the lower percentile (LP) maps to 0 and the upper percentile (UP) maps to 255, enhancing contrast and improving feature detection.

5.3. Oil Spill Detection with YOLO-v8
YOLO-v8, developed by Ultralytics as an enhanced version of YOLO-v5, brings significant improvements to the YOLO series, which is renowned for real-time object detection. The model adopts a decoupled head and anchor-free design, which separates the tasks of objectness prediction, classification, and bounding box regression, improving overall accuracy. This decoupling allows the model to focus on each task independently, leading to more precise detections. The anchor-free approach eliminates the need for predefined anchor boxes, allowing YOLO-v8 to directly predict the object’s location and bounding box, making it more flexible and adaptable to varying object sizes, scales, and shapes. Additionally, YOLO-v8 utilizes a modified CSPDarknet53 backbone, which incorporates Cross-Stage Partial (CSP) connections to enhance gradient flow and reduce computational cost, ensuring more efficient training and inference. The model also uses a sigmoid activation function to predict the objectness score, representing the likelihood of an object being present in the bounding box.
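A minimal fine-tuning sketch using the Ultralytics Python package is given below. It is an illustration under stated assumptions, not the training script used in this study: the checkpoint name `yolov8n-seg.pt`, the dataset description file `oil_spill.yaml`, the image size, and the confidence threshold are placeholders, and only the 100-epoch setting comes from the text.

```python
def finetune_oil_spill_model(data_yaml="oil_spill.yaml",
                             weights="yolov8n-seg.pt",
                             epochs=100,
                             imgsz=640):
    """Fine-tune a pretrained YOLOv8 segmentation checkpoint (hypothetical settings)."""
    # Imported lazily so this sketch can be loaded without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)                                    # load pretrained weights
    model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)  # fine-tune on the custom dataset
    metrics = model.val()                                    # evaluate on the validation split
    return model, metrics


def segment_image(model, image_path, conf=0.25):
    """Run inference on one image and return its predicted masks, or None if nothing is found."""
    results = model.predict(source=image_path, conf=conf)
    masks = results[0].masks          # None when no instance is detected
    return None if masks is None else masks.data
```

Calling `finetune_oil_spill_model()` requires a dataset file in the Ultralytics YAML format that lists the training and validation image folders and the class names; its path here is assumed.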
Figure.1

5.4. Data Preparation
The initial phase of this research involved the systematic collection of a comprehensive dataset of oil spill images. A targeted search was conducted to gather publicly available images depicting oil spills from various internet sources. We also extended the dataset with videos: videos containing footage of oil spills were downloaded, and individual frames were extracted from them. This ensured that dynamic and varied perspectives of oil spills were included, enhancing the robustness of the dataset. Example images are shown in Figure 2 and Figure 3.
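The frame-extraction step described above can be sketched as follows, assuming OpenCV (`cv2`) is available. The sampling interval of one frame per second, the function names, and the output file naming are illustrative assumptions; the text does not state how frames were sampled.

```python
import os


def sampling_step(fps, every_seconds=1.0):
    """Number of frames to skip between two saved frames (at least 1)."""
    return max(1, int(round(fps * every_seconds)))


def frame_indices(total_frames, fps, every_seconds=1.0):
    """Indices of the frames that would be kept for a video of known length."""
    return list(range(0, total_frames, sampling_step(fps, every_seconds)))


def extract_frames(video_path, out_dir, every_seconds=1.0):
    """Save sampled frames of a video as JPEG files and return how many were written."""
    import cv2  # imported lazily; requires the opencv-python package

    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0   # fall back if FPS metadata is missing
    step = sampling_step(fps, every_seconds)

    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:                # end of video or read error
            break
        if index % step == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{index:06d}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved


# A 4-second clip at 25 fps, sampled once per second, keeps these frame indices.
print(frame_indices(total_frames=100, fps=25.0, every_seconds=1.0))
```

Sampling at a fixed interval rather than keeping every frame limits the number of near-duplicate images, which would otherwise dominate the dataset without adding new perspectives.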
Figure.2
Figure.3
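The two preprocessing steps from Sections 5.1 and 5.2 can be sketched together in NumPy as shown below. This is a minimal illustration rather than the implementation used in this study: the function names, the fixed iteration count, the random initialization, and the synthetic test image are assumptions, and the conversion to the HSV color space is omitted for brevity.

```python
import numpy as np


def truncated_linear_stretch(image, lower=2, upper=98):
    """Clip intensities to the [LP, UP] percentile range and rescale to [0, 255]."""
    image = np.asarray(image, dtype=np.float64)
    lp, up = np.percentile(image, (lower, upper))   # computed over all channels jointly
    if up <= lp:                                    # flat image: nothing to stretch
        return np.zeros(image.shape, dtype=np.uint8)
    clipped = np.clip(image, lp, up)
    stretched = (clipped - lp) / (up - lp) * 255.0  # LP maps to 0, UP maps to 255
    return stretched.astype(np.uint8)


def kmeans_segment(image, k=3, iterations=10, seed=0):
    """Cluster pixels by color with plain K-means; return the label map and recolored image."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)        # 2D array: one row per pixel
    rng = np.random.default_rng(seed)
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]

    for _ in range(iterations):
        # Squared Euclidean distance from every pixel to every centroid.
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)                       # assignment step
        for j in range(k):                                  # update step
            members = pixels[labels == j]
            if len(members):                                # keep old centroid if cluster is empty
                centroids[j] = members.mean(axis=0)

    segmented = centroids[labels].reshape(h, w, c).astype(np.uint8)
    return labels.reshape(h, w), segmented


# Synthetic image: dark left half, bright right half.
demo = np.zeros((4, 6, 3), dtype=np.uint8)
demo[:, :3] = 20
demo[:, 3:] = 200
labels, segmented = kmeans_segment(truncated_linear_stretch(demo), k=2)
print(labels)
```

Percentiles are computed over all channels jointly here; stretching each channel separately is an equally plausible reading of Section 5.2.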
12
CHAPTER 6
CHAPTER 7
15
CHAPTER 8
CONCLUSION
16
REFERENCES
17