Abstract—The automatic detection and characterization of ships in optical remote sensing images is a key challenge for maritime surveillance applications. This paper presents an automated system specifically designed for ship detection in medium-resolution Sentinel-2 images. The proposed approach relies on a deep learning model trained on a dataset comprising over 6000 annotated Sentinel-2 images. It achieves a detection rate of 93%, with an average of 2.1 to 3.9 false alarms per Sentinel-2 image. Besides the detection task, it also addresses the estimation of ship lengths as well as ship headings. It yields a mean error of 15.36 m ± 19.57 m for ship lengths, and estimates ship headings with an accuracy of 93%. This contribution significantly enhances the performance of ship detection and characterization systems in optical remote sensing imagery.

Index Terms: Deep Neural Network, Sentinel-2 Images, Ship Detection, Ship Characterization

I. INTRODUCTION

Spaceborne remote sensing imagery conveys invaluable information for the monitoring of maritime activities, especially maritime traffic. This is of critical importance for both surveillance and defense issues [1]. We may cite, among others, the monitoring of maritime borders, the identification of illegal maritime behaviours, the fight against illegal fishing and smuggling activities, and search and rescue operations.

We can distinguish two main categories of satellite imagery for maritime surveillance: Synthetic Aperture Radar (SAR) imagery [2] and optical imagery [3]. Currently, SAR imagery is more widespread due to its applicability both day and night and in all weather conditions (e.g., cloudy conditions). Ship detection in SAR images [4] relies on relatively simple low-level image processing schemes, since ships possess a metallic structure highly responsive to radar signals. This may lead to detection ambiguities along the seashore, where other metallic structures (buoys, pontoons, etc.) cause high-amplitude patterns in SAR images [5], as well as in the case of sea clutter [6]. Additionally, inflatable and wooden boats such as Zodiacs and plastic boats can hardly be identified in SAR images. By contrast, optical imagery offers a practical alternative for ship detection, increasing not only detection capabilities in complex environments but also providing the ability to detect all types of ships.

In recent years, ship detection from optical imagery has seen increased research effort, with a focus on deep learning methods [7]. Most studies focus on ship detection from very-high-resolution (VHR) (0.3 m resolution) [8] and high-resolution (HR) (0.4–2 m resolution) [9] imagery. However, these satellite images are typically accessible exclusively through tasking modes, resulting in high acquisition costs and limited interest for continuous monitoring and surveillance tasks. By contrast, medium-resolution (MR) optical imagery (typically 10 m resolution), as deployed on Sentinel-2, delivers freely available remote sensing images for numerous locations on Earth with a revisit time ranging between 5 and 10 days. This seems particularly adapted to maritime surveillance tasks. Yet, only a few studies [7] have addressed the automated detection and characterization of ships in MR optical satellite imagery.

Here, we address these challenging issues and present a deep learning approach. We first collect a representative groundtruthed dataset comprising more than 12000 ship exemplars from 6000 Sentinel-2 images. Our multi-task deep learning scheme relies on a Faster R-CNN. Our numerical experiments explore data splitting strategies during the training phase to account for class imbalance. We also assess the impact of the neural backbone of the Faster R-CNN architecture. Overall, we report state-of-the-art performance with a detection rate above 93%, including for small ships with a ship length below 20 m. These results support the relevance of MR optical imagery for maritime monitoring besides HR optical and SAR imagery.

This paper is organized as follows. In Section II, we provide an overview of related work and analyze its drawbacks. The details of the proposed approach are presented in Section III. Section IV substantiates the relevance of our proposed methods through experiments. Discussion is covered in Section V, and the conclusion is presented in Section VI.

II. RELATED WORK AND MOTIVATION

Recent review papers [7], [10] provide surveys on ship detection and characterization in optical satellite images. They compile over a hundred research articles dating back to 1978, providing a comprehensive overview of the subject. The majority of these studies have investigated ship detection using High-Resolution (HR) and Very High-Resolution (VHR) images, employing deep learning approaches such as Faster R-CNN [11], YOLO [12] and U-net [13]. However, our specific challenge is different from working with high or very high-resolution (<5 m) images. In these cases, identifying ships among other objects is relatively straightforward due to
their visually distinct features. As a result, the challenge of ship detection in high-resolution images primarily revolves around maximizing the detection rate and closely aligns with the established problem of object detection in computer vision, involving issues such as adjacent and small object detection [11].

In medium-resolution imagery (10 m in our case), as illustrated in Figure 1, the visual detection and characterization of ships, especially smaller ones of 50 m long or below, may be complex: for instance, a large ship can resemble parts of a platform; a small island may have a topology similar to a ship; a very small ship might appear almost identical to a bright spot on rough waters, or even be mistaken for a small cloud. As a result, a key challenge is to maximize the detection rate while minimizing false alarms. This raises the question of how the above-cited studies for HR images apply to MR optical imagery.

Fig. 1: Illustration of complex ship images, presenting challenges in visual detection and characterization.

Only a limited number of studies have specifically explored ship detection in MR imagery. Most of these works tend to concentrate on specific aspects. For example, [14] delves into ship detection and characterization in favorable conditions. [15] addresses the issue of scarce annotated ship data and introduces a 'self-supervised learning' approach. Additionally, [16] introduces a method for identifying particular ship shapes associated with migrant activity. Few articles adopt a comprehensive approach [17], which is our primary area of interest.

This scarcity of studies is, in part, attributable to the absence of freely available ship datasets. In contrast to ship detection at high resolution, which benefits from an established reference dataset [18], the only publicly available Sentinel-2 ship datasets are [19] and [15]. These datasets comprise 31 (2000 ship exemplars) and 16 (1053 ship exemplars) Sentinel-2 images respectively, which may be too limited to deploy state-of-the-art learning-based frameworks.

Only [17] addresses the detection of ships in medium-resolution satellite optical images within a relatively general framework. In that article, the analyzed images are sourced from both the Sentinel-2 mission satellites and the Planet Labs Dove satellite constellation. The images are divided into 800 × 800 pixel patches, which are then used as input for the Faster R-CNN [20] object detection model. The annotation process relies on colocating the satellite images with AIS (Automatic Identification System) data [21]. AIS data comprise ship identifiers and locations used to groundtruth bounding boxes in the considered dataset. As most small ships are not equipped with AIS systems, this study only addresses large ships (> 100 m). Overall, this study reports an 85% detection rate but does not document the false alarm rate. These results do not appear fully conclusive compared with the performance reported for HR and SAR imagery.

The objective of our work is to create an automated system for detecting and characterizing ships in Sentinel-2 images.

III. PROPOSED APPROACH

We present a multi-stage approach for ship detection and characterization in Sentinel-2 images. In the initial stage, we employ a sliding window technique to systematically cover the image. Each window is of size 100 × 100 pixels, representing 1 km × 1 km, and overlaps with neighboring windows by 25%. This overlap ensures that if a ship extends across two windows, a significant portion of it remains detectable in at least one of them. A window is considered valid if it contains at least 5% of sea pixels, determined using the land/sea mask. Subsequently, these valid windows are categorized as either "Ship" or "No ship" using a Resnet-type [22] classifier. For the windows classified as "Ship", we utilize a Faster R-CNN detector to obtain a bounding box around the detected ships (only the coordinates of the ships are necessary). This detector incorporates a dedicated branch to estimate the ship's heading. Figure 2 provides an overview of our ship detection and heading estimation system.

When different adjacent patches detect parts of the same ship, we apply the non-maximum suppression (NMS) algorithm. However, in our approach, the suppression is not based on network confidence but rather on the size of the bounding box. Our primary objective in this context is to retain the largest bounding box to obtain the most precise ship coordinates. Finally, once the coordinates are identified, we create a 50 × 50 image centered on these coordinates to estimate the ship length using a Resnet-type network (see Figure 3).

The motivation behind proposing a two-stage strategy is as follows. In current state-of-the-art schemes, candidate proposals are typically classified based on their internal features. Consequently, a region containing a very small vessel may exhibit the same features as a region containing a portion of rough sea. The context becomes crucial in this scenario, and the initial classification step is necessary to capture the context within patches large enough to encompass it while remaining small enough to avoid missing very small vessels.

In our approach, each individual component undergoes distinct training phases, following a traditional methodology that includes training, validation, and testing. It is strongly recommended to include a substantial number of complete Sentinel-2 images during the model development phase. This approach not only allows for the identification and resolution of potential methodological limitations but also facilitates a comprehensive examination of the test dataset's distribution.

A. Classification phase

Developing a "Ship" and "No ship" category classifier involves two main phases. Firstly, we focus on generating
Fig. 2: An Overview of Our Ship Detection and Heading Estimation System on Sentinel-2 Images. The Faster R-CNN model comprises a backbone constructed from the Resnet18 classifier trained on Sentinel-2 data. The layer names in black used to build the detection feature map are derived from keras applications [23].
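Figure 2 summarizes the pipeline; its first stage, the sliding-window tiling with the 5% sea-pixel validity rule described in Section III, can be sketched as follows. This is a minimal illustration rather than the paper's code; the function names and the toy land/sea mask are ours.

```python
import numpy as np

def iter_windows(image, sea_mask, win=100, overlap=0.25, min_sea=0.05):
    """Yield (row, col, patch) for every valid sliding window of a scene.

    Windows are win x win pixels (100 px = 1 km at 10 m resolution) with
    25% overlap; a window is valid when at least 5% of its pixels are sea
    according to the land/sea mask, mirroring the rule in Section III.
    """
    stride = int(win * (1 - overlap))  # 25% overlap -> stride of 75 px
    h, w = sea_mask.shape
    for r in range(0, max(h - win, 0) + 1, stride):
        for c in range(0, max(w - win, 0) + 1, stride):
            mask = sea_mask[r:r + win, c:c + win]
            if mask.mean() >= min_sea:  # at least 5% sea pixels
                yield r, c, image[r:r + win, c:c + win]

# toy 200 x 200 scene: left half land (0), right half sea (1)
sea = np.zeros((200, 200), dtype=np.uint8)
sea[:, 100:] = 1
img = np.random.rand(200, 200)
coords = [(r, c) for r, c, _ in iter_windows(img, sea)]
```

On this toy scene, only the windows overlapping the sea half pass the validity test; all-land windows are discarded before the "Ship"/"No ship" classification.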
Fig. 7: Our K-means method for forming the training, validation, and test sets.
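The size-based variant of non-maximum suppression used to merge detections from adjacent overlapping patches (keeping the largest bounding box rather than the most confident one, as described in Section III) might look like the following sketch. The IoU threshold of 0.5 is our illustrative choice, not a value stated in the paper.

```python
def box_area(b):
    # box = (x1, y1, x2, y2) in scene coordinates
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])

def iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0

def size_nms(boxes, iou_thr=0.5):
    """Greedy NMS keyed on box area instead of confidence: among
    overlapping detections of one ship, keep the largest box."""
    order = sorted(range(len(boxes)),
                   key=lambda i: box_area(boxes[i]), reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thr for j in keep):
            keep.append(i)
    return [boxes[i] for i in keep]

dets = [(10, 10, 40, 20),   # large box for a ship
        (12, 11, 38, 19),   # smaller overlapping box from an adjacent patch
        (80, 80, 90, 88)]   # a second, distinct ship
kept = size_nms(dets)
```

Here the smaller duplicate of the first ship is suppressed in favor of the larger box, while the distinct second ship is retained.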
our classifier identifies within patches labeled as 'Ship'. Additionally, it may eliminate some false alarms if no ship is detected in the patch by the detector. To build our detection network, we employ the Faster R-CNN framework [20]. Despite its introduction in 2015, Faster R-CNN, along with adapted versions, continues to demonstrate state-of-the-art performance on established reference datasets [28], [29]. However, it is important to note that we do not explore other detector types in this work, including YOLO and its variants [30] and the detection transformer (DETR) and its variants [31].

We utilize a modified version of Faster R-CNN for detection, based on the code referenced in [32], implemented in the TensorFlow [33] framework. The primary motivation for these modifications is to enhance the detection of small vessels. The Faster R-CNN architecture can be divided into three key components: the Backbone network, the Region Proposal Network (RPN), and the Detector network. Since we are dealing with a single object type, we could use only the RPN along with the backbone network to detect ships. However, by employing the entire structure, we achieve significantly improved results. This improvement is attributed to the Detector network, which incorporates Fully Connected layers, refining the outcomes of the RPN, which relies solely on convolutional operations.

Our patches are sized at 100 × 100 pixels and contain sometimes very small vessels, representing just 1 pixel of the image, and sometimes large ships, occupying more than 40 pixels. To accommodate this size range, we configure the Region Proposal Network (RPN) to propose regions of interest even for the smallest objects. To achieve this, we utilize anchor boxes of various sizes based on the statistics of ship lengths in our dataset: we choose 4 × 4, 7 × 7, 10 × 10, 14 × 14, 18 × 18, and 28 × 28 for square-shaped anchor boxes. Additionally, we maintain the original aspect ratios from the Faster R-CNN paper [20], which are 1, 0.5, and 2. Furthermore, the input feature map for the RPN is set to a size of 25 × 25 (with a stride length of 4) to preserve crucial information from very small vessels, especially in challenging scenarios. This configuration allows the RPN to effectively handle objects of different sizes within our patches.

The feature map is constructed from the classification network mentioned earlier. In fact, the classification network demonstrates excellent results in distinguishing between "Ship" and "No ship" images. As a result, we believe that the feature maps generated by this network encompass the essential information required to serve as the backbone of our detector. While the use of the initial 25 × 25 feature map from the classification network aids in proposing smaller regions that may contain small ships, it remains relatively shallow for capturing information about larger vessels and other objects within the image. To address this, we integrate intermediate feature maps of sizes 13 × 13 and 7 × 7. By employing upsampling techniques, we combine these feature maps to create the feature map for region proposals, ensuring a comprehensive representation of potential objects of interest [34].

The region proposal network (RPN) architecture, after creating the feature maps, remains unchanged from the original Faster R-CNN. It consists solely of a convolutional layer (512 channels) and two additional convolutional layers, one for classification and the other for regression.

During the training of the RPN, an anchor is considered positive only if its Intersection over Union (IOU) with a ground truth box exceeds 0.7. Conversely, it is labeled as negative if its IOU falls below 0.3. Anchors falling within this range are not used for training purposes. To address the class imbalance issue, we employ a weighted loss function, which helps balance the small number of positive anchors per image.

After proposing the regions of interest, we apply ROI pooling to transform these regions into a fixed size of 6 × 6, which is determined based on the lengths of the ships in our datasets. Subsequently, these resized regions are forwarded to the detector network. The structure of our modified Faster R-CNN is illustrated within the green rectangle in Figure 2. Only the RPN part and the Detector part are trained, with a learning rate of 10⁻⁴. The weights of the backbone layers are frozen. We believe that since the classifier performs exceptionally well on a large dataset containing both "Ship" and "No ship" images, the network only needs to learn how to express the precise position of the ship, which is what the RPN and Detector are designed to accomplish.

C. Characterization of ships

In this section, we introduce the deep learning component for the estimation of the heading of the ship by adding an additional branch to the Faster R-CNN architecture. Additionally, we use a ResNet-type model to estimate the size of the ship.

1) Headings: While it was feasible to train a bounding box with the ship's orientation, as demonstrated in [35], we have
access to a dataset containing 3000 ships annotated with their heading. This information proves to be more valuable than a simple rotated bounding box for our specific task.

Concerning the training process, we keep all the weights of our previously discussed Faster R-CNN model frozen. Then, we introduce an additional branch after the ROI pooling layer, as illustrated within the sky blue rectangle in Figure 2, to handle ship orientation estimation. In this branch, we utilize both the cosine and sine of the ship's angle as model outputs. This approach is adopted to convey circular information effectively to the network.

2) Length Estimation: Since regions of interest (ROIs) are resized during the process, some size information is inevitably lost. To address this issue, one approach is to recover size information using ship orientation and the scaling factor. However, given the current resolution, attempts to introduce an additional branch for length estimation, similar to what we do for ship heading, do not yield satisfactory results. Therefore, we decided to employ a separate network for this purpose.

We utilize the Resnet50 network, with inputs sized at 50 × 50 pixels and centered on the ship. While smaller patches could be considered (46 pixels, since the biggest ship we can meet is 458 m), during the deployment of the entire ship detection system, the inputs for the length estimation model are constructed around the center of the ship's bounding box predicted by our detector. This bounding box may be slightly affected by ship wakes or irregular ship shapes. Hence, we opt for this patch size, taking into account potential small errors in the detection phase.

It is essential to note that, to ensure the robustness of our approach, we systematically exclude all images featuring partial cropping and inadequately labeled lengths. However, we do not exclude images of ships partially obscured by clouds if their lengths are still measurable; otherwise, they are excluded. This refinement leads to a dataset consisting of 7460 images, distributed randomly across training (80%), validation (10%), and test (10%) sets.

IV. EXPERIMENTS AND RESULTS

In this section, we present the outcomes of our system, along with experiments demonstrating the effectiveness of the methods used in both the classification and detection phases. We commence by introducing the experiments and results of the classification phase, followed by those of the detection phase. Subsequently, we delve into the characterization phase, and finally, we present the results of our complete ship detection system. To assess the performance of our system, we utilize a distinct dataset comprising 60 Sentinel-2 images, herein referred to as the Evaluation set. It is essential to note that this dataset is distinct from the training, validation, and test datasets used for training each component of the system.

A. Description of the Evaluation Set

The evaluation set consists of 60 Sentinel-2 images captured from approximately 20 different Earth locations, as indicated by red stars in Figure 8, at various acquisition times. These images encompass a wide range of scenarios, including coastlines, turbulent seas, artifacts, images with a high density of ships, low-density ship images, and cloud-covered scenes. In total, the 60 images collectively contain 878475 valid patches (a patch is considered valid if it contains at least 5% of sea pixels) and around 1147 ships. The image annotations are carried out by CLS analysts. It is important to note that very small ships may not have been annotated due to the inherent difficulty and resolution limitations. Our evaluation set is available in [36].

Fig. 8: Location of the 60 Sentinel-2 Images Constituting the Evaluation Set.

B. Classification Experimentations

1) Evaluating Distribution Verification Effectiveness: In this experiment, we analyze and compare the results achieved by employing various data splitting methods. We utilize the Resnet152 network pre-trained on the Imagenet dataset, as it exhibits the best performance. We examine three distinct data splitting configurations:
• In the first approach, we implement a random split, as depicted in the first phase of the diagram in Figure 7. This split is referred to as the "Random split."
• In the second approach, we employ a split based on subclasses, as shown in the second phase of the diagram in Figure 7. This split is referred to as the "Subclass split."
• In the third approach, we create a split based on clusters within each subclass, represented in the third phase of the diagram in Figure 7. This split is referred to as the "Subclass k-means split."

We present the performance metrics for the test sets in Table I, each comprising approximately 6500 images, with 2500 ship images and 4000 no-ship images. The metrics used to assess our classifiers include the False Positive Number (FP) and Recall. We opted for FP instead of precision due to the minimal occurrence of false positives. These metrics are compared across different confidence thresholds: 0.50, 0.75, and 0.95.

Table I shows comparative results; however, we cannot draw conclusions without testing the methods on a common dataset: the evaluation set (constructed using 60 Sentinel-2 images, as previously mentioned).

Table II displays the results of the three distinct models, evaluated on the Evaluation set. The FP metric represents the aggregate count of false positives observed across all 60 evaluation images, encompassing approximately 877470
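The circular encoding used by the heading branch of Section III-C, predicting the cosine and sine of the ship's angle, can be illustrated as follows. This is a minimal sketch independent of the paper's model; atan2-based decoding is the standard counterpart of such an encoding.

```python
import math

def heading_to_targets(deg):
    """Encode a heading in degrees as the (cos, sin) pair the branch regresses."""
    rad = math.radians(deg)
    return math.cos(rad), math.sin(rad)

def targets_to_heading(cos_v, sin_v):
    """Decode network outputs back to a heading in [0, 360) degrees.

    atan2 handles the wrap-around: 359 deg and 1 deg map to nearby points on
    the unit circle, whereas a raw angle regression would see them as 358
    degrees apart.
    """
    return math.degrees(math.atan2(sin_v, cos_v)) % 360.0

c, s = heading_to_targets(350.0)
recovered = targets_to_heading(c, s)  # close to 350.0 up to float rounding
```

This is why the branch outputs two values per ship instead of one: the (cos, sin) pair makes the regression target continuous across the 0/360 boundary.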
In the remainder of our results, we exclusively employ the dataset split derived from the k-means method. To minimize the False Positive Number, we conduct extensive experiments with various models pretrained on Imagenet. Table III summarizes the outcomes of these models in terms of False Positive Number (FP) and Recall metrics.

Model / Threshold            0.5           0.75          0.95
                             FP   Rec.(%)  FP   Rec.(%)  FP   Rec.(%)
InceptionV3                  36   97.6     25   97.0     11   95.0
Resnet18                     40   97.0     28   96.9     13   94.0
Resnet34                     33   97.5     24   97.2      9   94.8
Resnet152                    25   98.1     16   97.5      8   95.9
Ens. Resnet34/Resnet152      14   96.5     10   95.3      4   93.0
TABLE III: Evaluation of Various Pretrained Models' Performance on the Test Dataset.

Our aim is to create the optimal ensemble, prioritizing FP reduction while tolerating a slight drop in Recall. To achieve this, we employ ensemble models and manually selected aggregation rules, guided by one criterion: False Positives should not include any images clearly devoid of vessels. Consequently, we choose Resnet152 and Resnet34 with an aggregation rule stipulating that both models must output a confidence score exceeding 0.95 to classify an image as containing a ship.

The Ensemble network yields a Recall of 93% with only 4 false positives on the test set. Figure 9 illustrates these four false positives. It is not evident whether these images contain a ship.

Faster R-CNN / ResNet18          70    66    45.7  30.9
MRS-Faster R-CNN / ResNet18      80.4  18.8  75.4  77.1
MRS-Faster R-CNN / ResNet34      77.5  23.7  76.9  74.4
MRS-Faster R-CNN / ResNet152     78    27.8  75    73.1
TABLE IV: Performance Metrics for Different Detection Models.

Our MRS-Faster R-CNN demonstrates promising performance in comparison to a standard version of Faster R-CNN. Even though Resnet152 and Resnet34 yield superior classification results (Table III), the feature maps constructed from Resnet18 produce the best outcomes. We hypothesize that the comparatively shallower architecture of Resnet18 allows the intermediate feature maps, which are of a relatively large size, to encompass a richer set of information. It is worth noting that utilizing a backbone constructed from Resnet152 or Resnet34 for detection will help minimize image processing time in our system, albeit with a minor trade-off in performance.

Detection and False Alarm rates are influenced by two key factors. Firstly, our dataset comprises numerous small vessels with wakes, leading to instances where wakes are considered as part of the ship due to resolution limitations. However, our detector consistently distinguishes between wakes and vessels on larger ships, as depicted in Figure 10. Secondly, our model encounters challenges in detecting closely positioned ships, particularly small vessels, as illustrated in Figure 11. Refining the Intersection over Union (IOU) condition to 0.25, the Detection rate reaches up to 95%, and the False alarm rate decreases to 5%.

2) Direct application of the Faster R-CNN: In this experiment, we directly apply our detector to the evaluation set,
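The aggregation rule of the ensemble classifier above, where both Resnet152 and Resnet34 must output a confidence above 0.95 for the "Ship" label, reduces to a conjunction over per-model scores. A minimal sketch, with hypothetical confidence values:

```python
def ensemble_is_ship(scores, threshold=0.95):
    """Conjunctive aggregation: every classifier in the ensemble must
    output a 'Ship' confidence above the threshold to keep the patch."""
    return all(s > threshold for s in scores)

# hypothetical per-patch confidences from (Resnet152, Resnet34)
both_confident = ensemble_is_ship((0.99, 0.97))  # kept as a ship
one_unsure = ensemble_is_ship((0.99, 0.80))      # rejected: one model unsure
```

Requiring agreement at a high threshold is what trades a few points of Recall for the sharp drop in false positives reported for the ensemble.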
Image Name Main Features Total Ships DR(%) FA Potential Ships, Not Clear
S2A MUL ORT 13 20220521T084745 20220521T084745 00064 Coast, Calm sea 15 86.6 1 0
S2A MUL ORT 13 20220521T084756 20220521T084756 00064 Coast, Calm sea 13 100 0 7
S2A MUL ORT 13 20220702T063238 20220702T063238 00091 Clouds 0 . 0 0
S2A MUL ORT 13 20220707T083745 20220707T083745 00021 Coast, Calm sea, Harbor 140 94.2 6 3
S2A MUL ORT 13 20220712T063233 20220712T063233 00091 Clouds 0 . 0 0
S2A MUL ORT 13 20220719T074600 20220719T074600 00049 Coast, Islands 18 100 0 13
S2A MUL ORT 13 20220719T074613 20220719T074613 00049 Coast, Islands, Clouds 26 100 4 20
S2A MUL ORT 13 20220719T074628 20220719T074628 00049 Coast, Islands, Clouds, Underwater relief 15 93.3 4 3
S2A MUL ORT 13 20220719T074638 20220719T074638 00049 Coast, Islands, Clouds, Underwater relief, Artifacts 8 100 4 3
S2A MUL ORT 13 20220801T063228 20220801T063228 00091 Small Clouds 0 . 0 0
S2A MUL ORT 13 20220801T063229 20220801T063229 00091 Small Clouds, Islands 0 . 0 0
S2A MUL ORT 13 20220801T063235 20220801T063235 00091 Small Clouds, Islands 0 . 0 0
S2A MUL ORT 13 20220808T062238 20220808T062238 00048 Clouds 0 . 0 0
S2A MUL ORT 13 20220818T062238 20220818T062238 00048 Clouds 1 0 0 0
S2A MUL ORT 13 20220821T063236 20220821T063236 00091 Clouds 0 . 0 0
S2A MUL ORT 13 20220828T062238 20220828T062238 00048 Complete Cloud Cover 0 . 0 0
S2A MUL ORT 13 20220908T004106 20220908T004106 00059 Coast, Small Clouds, Underwater relief 14 85.7 0 2
S2A MUL ORT 13 20220908T004119 20220908T004119 00059 Coast, Small Clouds, Underwater relief 0 . 0 1
S2A MUL ORT 13 20220908T004130 20220908T004130 00059 Coast, Small Clouds, Underwater relief 16 81.3 5 0
S2A MUL ORT 13 20220910T063231 20220910T063231 00091 Calm Sea, Small Clouds 0 . 0 0
S2A MUL ORT 13 20220910T063234 20220910T063234 00091 Clouds 0 . 0 0
S2A MUL ORT 13 20220912T220946 20220912T220946 00129 Clouds, Swell 0 . 7 0
S2A MUL ORT 13 20220915T034601 20220915T034601 00018 Clouds, Platforms 14 100 2 5
S2A MUL ORT 13 20220915T034604 20220915T034604 00018 Clouds, Platforms 15 100 1 5
S2A MUL ORT 13 20220915T034630 20220915T034630 00018 Clouds 3 100 0 0
S2A MUL ORT 13 20220915T222001 20220915T222001 00029 Clouds 0 . 0 0
S2A MUL ORT 13 20220918T035449 20220918T035449 00061 Clouds 40 90 6 3
S2A MUL ORT 13 20220918T035521 20220918T035521 00061 Coast, Clouds 8 75 0 0
S2A MUL ORT 13 20220918T035607 20220918T035607 00061 Coast, Clouds, Islands 20 80 4 2
S2A MUL ORT 13 20220918T035616 20220918T035616 00061 Coast, Clouds 17 70 2 0
S2A MUL ORT 13 20220918T035622 20220918T035622 00061 Coast, Clouds, Islands 97 92.7 1 4
S2A MUL ORT 13 20220920T063234 20220920T063234 00091 Clouds 0 . 0 0
S2B MUL ORT 13 20220509T085722 20220509T085722 00107 Coast, Calm Sea 31 100 4 3
S2B MUL ORT 13 20220704T062230 20220704T062230 00048 Clouds 0 . 0 0
S2B MUL ORT 13 20220709T083234 20220709T083234 00121 Coast, Clouds, Underwater relief 69 92.7 6 1
S2B MUL ORT 13 20220709T083248 20220709T083248 00121 Coast, Calm Sea 13 100 1 1
S2B MUL ORT 13 20220710T145840 20220710T145840 00139 Coast, Clouds, Artifacts 8 100 2 2
S2B MUL ORT 13 20220717T063228 20220717T063228 00091 Clouds 0 . 0 0
S2B MUL ORT 13 20220718T155111 20220718T155111 00111 Coast, Clouds 0 . 2 0
S2B MUL ORT 13 20220719T083234 20220719T083234 00121 Coast, Calm Sea, Underwater relief 87 91.9 1 2
S2B MUL ORT 13 20220719T083238 20220719T083238 00121 Calm Sea, Underwater relief, Platforms 147 91.8 10 11
S2B MUL ORT 13 20220722T070213 20220722T070213 00020 Coast, Clouds, Swell 129 96.8 34 5
S2B MUL ORT 13 20220727T063228 20220727T063228 00091 Clouds 0 . 0 0
S2B MUL ORT 13 20220813T062230 20220813T062230 00048 Clouds 0 . 0 0
S2B MUL ORT 13 20220816T063227 20220816T063227 00091 Clouds 1 0 0 0
S2B MUL ORT 13 20220828T083001 20220828T083001 00121 Coast, Harbor, Calm Sea 107 97.1 9 3
S2B MUL ORT 13 20220828T083030 20220828T083030 00121 Coast, Clouds 31 74.1 3 0
S2B MUL ORT 13 20220909T005948 20220909T005948 00002 Small Clouds 0 . 0 0
S2B MUL ORT 13 20220909T005951 20220909T005951 00002 Small Clouds 1 100 1 0
S2B MUL ORT 13 20220909T005956 20220909T005956 00002 Clouds, Islands 7 57 2 0
S2B MUL ORT 13 20220909T005959 20220909T005959 00002 Calm Sea, Clouds 5 100 0 0
S2B MUL ORT 13 20220909T010014 20220909T010014 00002 Calm Sea, Clouds 5 80 0 0
S2B MUL ORT 13 20220909T010028 20220909T010028 00002 Coast, Calm Sea 1 100 0 2
S2B MUL ORT 13 20220912T062227 20220912T062227 00048 Clouds 0 . 0 0
S2B MUL ORT 13 20220916T040541 20220916T040541 00104 Coast, Clouds 4 50 2 1
S2B MUL ORT 13 20220916T040545 20220916T040545 00104 Clouds, Islands 0 . 0 2
S2B MUL ORT 13 20220916T040610 20220916T040610 00104 Coast, Clouds 11 90 2 1
S2B MUL ORT 13 20220916T040624 20220916T040624 00104 Clouds 3 100 1 0
S2B MUL ORT 13 20220916T040628 20220916T040628 00104 Clouds 1 100 1 1
S2B MUL ORT 13 20220916T040634 20220916T040634 00104 Clouds 6 100 0 0
TABLE V: Detection System’s performance on the Evaluation Set [36] with Detection Rate and False Alarm Count metrics.
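From per-image rows such as those in Table V, aggregate figures (overall detection rate, mean false alarms per image) can be recovered by weighting each image by its ship count. A sketch using two rows taken from the table rather than the full evaluation set:

```python
def aggregate(rows):
    """rows: (total_ships, detection_rate_percent, false_alarms) per image.
    Returns the ship-weighted detection rate (%) and the mean number of
    false alarms per image."""
    total_ships = sum(r[0] for r in rows)
    detected = sum(r[0] * r[1] / 100.0 for r in rows)
    fa_per_image = sum(r[2] for r in rows) / len(rows)
    dr = 100.0 * detected / total_ships if total_ships else float("nan")
    return dr, fa_per_image

# two illustrative rows from Table V (not the full evaluation set)
rows = [(140, 94.2, 6), (15, 86.6, 1)]
dr, fa = aggregate(rows)  # dr is about 93.5%, fa = 3.5 false alarms per image
```

Weighting by ship count matters: an image with 140 ships should influence the overall detection rate far more than one containing a single vessel.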
Fig. 15: Examples of Ambiguous Heading.

to instances where wake shapes or partial cloud coverage make the length measurement task challenging (Figure 17). By excluding these challenging instances, the dataset now consists

V. DISCUSSION

The detection of ships in medium resolution optical images remains an underexplored field, with only a few studies adopting a comprehensive approach to this task. The scarcity of research in this area is not indicative of its insignificance but rather stems from the limited availability of data and annotated resources. In an attempt to bridge this gap, we
Fig. 18: Integration of the Ship Detection System in CLS Processing Pipeline - Initial System Trials.
Vincent Kerbaol for his pivotal support, making this project possible. We express our gratitude to the CLS analysts for their meticulous annotation of the datasets.

Special recognition is extended to Etienne Gauchet, Mohammed Benayad, and Maxime Vandevoorde for their contributions to data cleaning and generation. We offer heartfelt thanks to Chaïmae Sriti and Aurélien Colin for their comprehensive review of the paper and the insightful comments they provided.

REFERENCES

[1] EMSA, "EMSA outlook 2023," 2022.
[2] A. Iodice and G. Di Martino, Maritime Surveillance with Synthetic Aperture Radar. Institution of Engineering and Technology, 2020.
[3] V. W. G. H, Satellite Imaging for Maritime Surveillance of the European Seas. Berlin (Germany): Springer Science+Business Media B.V., 2008.
[4] D. J. Crisp, "The state-of-the-art in ship detection in synthetic aperture radar imagery," tech. rep., Defence Science and Technology Organisation, Salisbury (Australia), 2004.
[5] L. Zhai, Y. Li, and Y. Su, "Segmentation-based ship detection in harbor for SAR images," in 2016 CIE International Conference on Radar (RADAR), pp. 1–4, 2016.
[6] M. Stasolla, J. J. Mallorqui, G. Margarit, C. Santamaria, and N. Walker, "A comparative study of operational vessel detectors for maritime surveillance using satellite-borne synthetic aperture radar," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 9, no. 6, pp. 2687–2701, 2016.
[7] B. Li, X. Xie, X. Wei, and W. Tang, "Ship detection and classification from optical remote sensing images: A survey," Chinese Journal of Aeronautics, vol. 34, no. 3, pp. 145–163, 2021.
[8] S. Voinov, D. Krause, and E. Schwarz, "Towards automated vessel detection and type recognition from VHR optical satellite images," in IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium, pp. 4823–4826, 2018.
[9] W. Chen, B. Han, Z. Yang, and X. Gao, "MSSDet: Multi-scale ship-detection framework in optical remote-sensing images and new benchmark," Remote Sensing, vol. 14, no. 21, 2022.
[10] U. Kanjir, H. Greidanus, and K. Oštir, "Vessel detection and classification from spaceborne optical images: A literature survey," Remote Sensing of Environment, vol. 207, pp. 1–26, 2018.
[11] S. Zhang, R. Wu, K. Xu, J. Wang, and W. Sun, "R-CNN-based ship detection from high resolution remote sensing imagery," Remote Sensing, vol. 11, no. 6, 2019.
[12] G. Tang, S. Liu, I. Fujino, C. Claramunt, Y. Wang, and S. Men, "H-YOLO: A single-shot ship detection approach based on region of interest preselected network," Remote Sensing, vol. 12, no. 24, 2020.
[13] S. Karki and S. Kulkarni, "Ship detection and segmentation using Unet," in 2021 International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), pp. 1–7, 2021.
[14] H. Heiselberg, "A direct and fast methodology for ship recognition in Sentinel-2 multispectral imagery," Remote Sensing, vol. 8, no. 12, 2016.
[15] A. Ciocarlan and A. Stoian, "Ship detection in Sentinel 2 multi-spectral images with self-supervised learning," Remote Sensing, vol. 13, no. 21, 2021.
[16] U. Kanjir, "Detecting migrant vessels in the Mediterranean Sea: Using Sentinel-2 images to aid humanitarian actions," Acta Astronautica, vol. 155, pp. 45–50, 2019.
[17] D. Štepec, T. Martinčič, and D. Skočaj, "Automated system for ship detection from medium resolution satellite optical imagery," in OCEANS 2019 MTS/IEEE SEATTLE, pp. 1–10, 2019.
[18] S. Kızılkaya, U. Alganci, and E. Sertel, "VHRShips: An extensive benchmark dataset for scalable deep learning-based ship detection applications," ISPRS International Journal of Geo-Information, vol. 11, no. 8, 2022.
[19] F. D. Vieilleville, A. Lagrange, N. Dublé, and B. L. Saux, "Sentinel-2 dataset for ship detection," Mar. 2022.
[20] S. Ren, K. He, R. B. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," in NIPS (C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, eds.), pp. 91–99, 2015.
[21] A. Milios, K. Bereta, K. Chatzikokolakis, D. Zissis, and S. Matwin, "Automatic fusion of satellite imagery and AIS data for vessel detection," in 2019 22nd International Conference on Information Fusion (FUSION), pp. 1–5, 2019.
[22] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016.
[23] F. Chollet et al., "Keras applications." https://keras.io/api/applications/, 2015.
[24] D. Hendrycks, K. Lee, and M. Mazeika, "Using pre-training can improve model robustness and uncertainty," CoRR, vol. abs/1901.09960, 2019.
[25] J. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Statistics, University of California Press, Berkeley, pp. 281–297, 1967.
[26] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255, IEEE, 2009.
[27] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," CoRR, vol. abs/1512.00567, 2015.