2019 Thesis El Emrani
Geoinformatics
Master thesis
I declare that the work presented in this dissertation report entitled “Flood Detection
with a Deep Learning Approach Using Optical and SAR Satellite Data”, submitted to
IPI institute, Leibniz University Hannover, for the award of the degree, Master of Science
in Geodesy and Geoinformatics is my original work. I have not plagiarized or submitted
the same work for the award of any other degree.
Hannover, 29.11.2019
Flood Detection with a Deep Learning Approach Using Optical and SAR Satellite Data Page i
2 Abstract
This study presents an approach to automatically detect flooded regions using satellite
data. The proposed method is applicable to urban and bare-soil areas. For this
purpose, data from two sensors with different modalities are used. The first dataset is
radar data acquired by the Sentinel-1 A/B satellites, which provide complex data in
co- and cross-polarization (VV and VH). The second dataset is optical data acquired
by the Sentinel-2 A/B satellites, which provide multispectral data in 13 bands ranging from
the visible to the shortwave infrared part of the electromagnetic spectrum. Radar data
are advantageous because they can be acquired in any weather, unlike
optical data, which cannot image the surface through clouds. Moreover, radar data
are useful for flood mapping because water bodies appear as dark areas owing to the
low backscatter of smooth water surfaces. On the other hand, water surfaces reflect and
absorb differently at different wavelengths from the visible to the shortwave
infrared, so multispectral optical data can also be used to recognize water bodies.
Detecting floods quickly and precisely is crucial because it improves crisis
management and consequently reduces the damage caused by the disaster.
Deep neural networks are showing a growing ability to handle big
data in a variety of tasks such as object detection, change detection, and
object classification. In this study, we use a supervised deep learning network,
one of the latest trending methods in remote sensing, to map flooded areas
using radar and optical satellite data. Specifically, we use Siamese neural networks to
detect the water bodies resulting from flood events by simultaneously applying semantic
segmentation to pre- and post-event images to produce feature maps. The difference
between the feature maps derived from the pre- and post-event images reveals the flooded
region. The proposed supervised approach is trained using reference data that were
created in this study from the optical and radar data of two different flood regions in
Iran. Part of the data was kept for testing, and the results show that the developed
methods detect flooded areas with an accuracy of around 94.67% using optical data
and around 81% using radar data.
Contents
1 Undertaking
2 Abstract
3 Introduction
4 Theoretical Background
4.2.2 U-net
4.4 Conclusion
5.1 U-net
5.2 Networks
6 Experimental Settings
6.1.2 Data Collection
7 Experimental Results
7.3 Discussion
8 Conclusion
List of Figures
4 A single node
5 A U-net architecture
7 Overall network
14 Optical data based network training and validation accuracy and loss (learning) curves
15 SAR data based network training and validation accuracy and loss (learning) curves
19 Flood detection results
List of Tables
6 Evaluation results
3 Introduction
Recent years have seen an unprecedented surge in floods around the world. According
to [13], inundations in Central Europe have increased by a factor of two since 1980.
These floods cause massive economic damage not only to governments, but also to
individuals and private institutions such as insurance companies. For this reason, the
entities concerned encourage research and development to find new methods that improve
the services provided when such events occur, for better crisis response and mitigation.
Over recent years, the accessibility of satellite remote sensing has enabled researchers
to use it to monitor natural disasters such as hydrological disasters and earthquakes.
Free access to these products has allowed users from different backgrounds to extend
their investigations with these huge amounts of data. Currently, satellite data can be
considered an effective tool for estimating disaster damage and improving catastrophe
risk assessment because of the diverse modalities of the sensors and the huge volume of
data with various spatial, temporal, and spectral resolutions. This data availability has
allowed the creation of services that support the development of automated or
semi-automated methods for quick flood mapping. These methods carry some uncertainty,
since their results are produced rapidly and are not checked. More accurate flood mapping
methods, such as photos from unmanned aerial vehicles or field surveys, are used in
emergencies. Nonetheless, these techniques are time consuming, costly, and sometimes not
even possible to carry out on time in bad weather. Consequently, new procedures were
introduced in remote sensing to address these challenges. Neural networks have become the
new trend in remote sensing because of their success in many computer vision tasks.
The contributions of this thesis are the following:
• Studying and highlighting the problems encountered in flood mapping using Synthetic
Aperture Radar (SAR) and optical imagery acquired over urban and non-urban
areas
• Two end-to-end trainable neural networks for flood detection are proposed using
optical and SAR satellite data respectively. The proposed methods also attempt
to map only water bodies resulting from a flood event and not permanent water
surfaces.
Flood mapping is a challenging task with traditional methods because data are scarce
while a flood is ongoing. Airborne surveys are limited by the bad weather that usually
follows a flood event. More importantly, it is usually very expensive to produce a flood
map within a sufficiently short period of time. Hence, the need for a method that is
reliable and cost effective is the motivation behind this work. The potential market for
this solution includes public environmental management agencies, governmental
authorities, and insurance companies. Flood mapping is one of the applications where
satellite data can be valuable: it helps in risk management and damage assessment, and
provides more information to rescuers during flooding. Specifically, optical and SAR
satellite data can be used to detect and map floods.
SAR data is useful because measurements are taken regardless of weather and lighting
conditions. The sensors behind these products are active sensors, which use
their own energy to transmit signals and observe an area.
old. Factors to be considered to find the threshold are spatial resolution of the satellite
image, presence of shadow, and the ground area. The last factor in particular requires
recalibrating the threshold values whenever a different area is mapped, which limits the
generalization of the method. The authors in [] presented several semi-automatic and manual
methods to map inundations, by exploiting free satellite multispectral and SAR data.
Because of the success of DNNs in various research areas, remote sensing is likewise
starting to adopt them for many of its applications. However, remote sensing data
presents some new challenges for deep learning. For example, the datasets used in this
thesis come from optical and SAR sensors and are hence multimodal.
4 Theoretical Background
Earth observation data from space can be very useful because it provides
valuable and timely information when investigating emergencies such as floods. Another
advantage of satellite observation is the ability to acquire data over large areas, even
in hard-to-reach regions or during natural disasters like floods. The optical and SAR
sensor modalities are the focus of this study.
This section gives further details about the SAR and optical data acquired by Sentinel
1 and Sentinel 2 respectively. The datasets used in this thesis are derived from processed
image products of these two satellite missions and can be described by the
following characteristics:
truth data, and methodology selection for flood mapping.
• StripMap (SM)
The ground swath is illuminated with a continuous sequence of pulses while the
antenna beam is pointed to a fixed angle in azimuth and an approximately fixed
off-nadir angle.
• Wave (WV)
Several vignettes are acquired exclusively in either VV or HH polarization, and
each vignette processed as a separate image.
The Sentinel 1 sensors measure the strength of the returned radar signal and
its round-trip time to obtain the range location and brightness
(or amplitude) of each pixel. One of the major reasons why SAR data is used in
flood detection is that microwave instruments penetrate clouds and operate
day and night. Another valuable property of SAR data is
microwave backscattering. The backscattering coefficient is the square of the amplitude
of the complex signal, and it is affected by surface roughness. Water filled areas are
Table 1: Some of Sentinel 1 characteristics

                    Sentinel 1
Resolution          5 x 5 m (SM mode)
                    5 x 20 m (IW mode and WV mode)
                    25 x 100 m (EW mode)
Orbit               693 km
                    Inclination 98.18°
                    Sun-synchronous
Repetition rate     12 days
Swath               80 km
                    250 km
                    400 km
Processing levels   Level-0 Raw for SM, IW, EW modes
                    Level-1 SLC for SM, IW, EW, WV modes
                    Level-1 GRD for SM, IW, EW, WV modes
                    Level-2 OCN for SM, IW, EW, WV modes
                    Level-0 WV (not released to users)
Polarization        Single polarization (HH or VV) for SM, IW, EW, WV modes
                    Dual polarization (HH+HV or VV+VH) for SM, IW, EW modes
On the other hand, areas such as cities are affected by the double bounce effect,
which occurs when two smooth surfaces form a right angle facing the
radar beam. The beam then bounces twice off the surfaces, and most of the radar
energy is reflected back to the sensor. Figure 2 illustrates the double bounce effect.
Sentinel 2 is an earth observation mission from ESA launched in June 2015, with two
satellites, S2A and S2B, phased at 180° to each other [3]. The orbits of the Sentinel 2
satellites are sun-synchronous, so the angle of the sun upon the surface of the earth is
maintained. Basic information about the satellites of the mission is shown in table 2,
while table 3 describes the 13 spectral bands sampled by the satellites' Multi-Spectral
Instrument (MSI) [3]. The optical sensor of the mission captures images with the behavior
described in figure 3. The imaging of an object is affected by the sensor incidence angle
θ, the sun elevation angle, and the height and size of the object.
Table 2: Some of Sentinel 2 characteristics

                         Sentinel 2
Resolution               10 m (4 bands)
                         20 m (6 bands)
                         60 m (3 bands)
Orbit                    786 km
                         Inclination 98.1°
                         Sun-synchronous
Repetition rate          10 days for each single SENTINEL-2 satellite
                         5 days at the Equator for the combined constellation
Area covered             100 x 100 km²
Radiometric resolution   12 bit by MSI
Processing levels        Level-1C and Level-2A
                         Level-0, Level-1A and Level-1B (not released to users)
Figure 3: Imaging of a building by an optical sensor
The ability of the Sentinel 2 mission to obtain data in a multitude of bands with different
wavelengths helps in identifying inundated regions through the reflectance
properties of water in those bands. Thus, optical data can be used for flood detection
with water indices or with automated algorithms like DNNs. The most widely used
water indices are the Normalized Difference Water Index (NDWI) and the
Modified NDWI (MNDWI). NDWI is calculated using the green band and the
near-infrared band, respectively bands 3 and 8, as follows:
\[
\mathrm{NDWI} = \frac{\mathrm{Band3} - \mathrm{Band8}}{\mathrm{Band3} + \mathrm{Band8}} \tag{1}
\]
following:
\[
\mathrm{MNDWI} = \frac{\mathrm{Band3} - \mathrm{Band12}}{\mathrm{Band3} + \mathrm{Band12}} \tag{2}
\]
After the calculation of the indices, a threshold is applied to map water pixels and
non-water pixels.
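The index-and-threshold procedure can be illustrated with a minimal sketch. It is not the processing chain used in this thesis: the function names, the sample reflectance values, and the threshold of 0 are assumptions chosen for illustration, and plain Python lists stand in for image rasters.

```python
def ndwi(green, nir):
    """Normalized difference of two reflectance values (0.0 if both are zero)."""
    total = green + nir
    return (green - nir) / total if total != 0 else 0.0


def water_mask(green_band, nir_band, threshold=0.0):
    """Label a pixel as water (1) when its NDWI exceeds the threshold.

    The default threshold of 0.0 is an assumed, commonly used starting point;
    in practice it has to be tuned for each scene.
    """
    return [
        [1 if ndwi(g, n) > threshold else 0 for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


# Hypothetical 2x2 reflectance patches for Band 3 (green) and Band 8 (NIR).
band3 = [[0.10, 0.08], [0.06, 0.09]]
band8 = [[0.02, 0.30], [0.01, 0.40]]
mask = water_mask(band3, band8)  # left column is labeled water
```

Passing Band 12 instead of Band 8 as the second argument would give the index of equation 2.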
The use of machine learning techniques, in particular deep learning, is becoming
increasingly important. This tool has proven to be both powerful and efficient in multiple
application fields, including remote sensing. It was also named one of the
ten breakthrough technologies of 2013 [13]. Deep learning uses neural networks (NN) to
extract high-level features from raw input. A neural network is formed of multiple layers
of units that model the nerve cells, or neurons, of the brain. To be called a deep neural
network, a neural network should involve two or more hidden layers between the input layer
and the output layer.
The core unit of a neural network is the node or neuron. The earliest NNs were
based on the perceptron [2], which mathematically models the behavior of a biological
neuron, as shown in figure 4.
The node applies a function f to the inputs it receives from the nodes connected
to it and passes the resulting output on to another node. The nodes are connected to
each other by edges. Each edge
has a weight w, as seen in figure 4, that is used by function f in its computation.
Biases, like the bias b in figure 4, can also be added depending on the application of
the NN. Multiple nodes form a layer, and an NN with two or more hidden layers forms a
deep NN (DNN). In this thesis, the interest is focused on one type of neural network,
the deep feedforward neural network. These NNs do not include loops or cycles, and their
layers are arranged in the following order: an input layer, two or more hidden layers,
and an output layer.
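As a minimal sketch of the single node described above, the following computes the weighted sum of the inputs plus a bias and passes it through an activation function. The function names and the numeric values are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function, used here as an example activation."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias, activation):
    """Output of one node: activation(sum_i w_i * x_i + b)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


# Hypothetical inputs and parameters: 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0
out = node_output([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid)
```

A layer is a list of such nodes evaluated on the same inputs, and a feedforward network chains layers so that the outputs of one layer become the inputs of the next.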
The input layer nodes receive the data that the user feeds to the network and
pass it to the next layer without performing any computation. Next, the
hidden layer takes as input what is forwarded to it from the input layer and performs
computations on it before forwarding the result to the next layer. These computations
include mathematical functions, called activation functions, that introduce non-linearity
into the NN. Non-linearity is crucial because it allows the NN to model a response
variable (target variable, class label, or score) that varies non-linearly with its
explanatory variables, that is, with the input of the NN. The hidden layers are not
connected with the outside world, and the output of a hidden layer is passed only to the
next hidden layer or to the output layer. The final layer, or output layer, takes as
input the output of the previous layer and forwards the result to the user.
The task of the DNN used in this project is to classify each pixel of the input image
as a “flood” pixel or a “non-flood” pixel. Given that the model is trained with data
containing image pairs and their corresponding flood masks, the learning of the DNN is
supervised. The goal of the DNN is to reach the point of least error as quickly as
possible. A key concept in the learning process of the network is the calculation of the
error between the predicted output of the network and the ground-truth labels, or masks
in this case. The mathematical expression of this error is called the loss function. As
stated previously, the edges have weights, and biases can also be added to the input of
the activation function. These parameters are part of the model and need to be adjusted
with respect to the error given by the loss function. Optimizing the weights and biases
enables the NN to make better guesses, or outputs, because the parameters are adjusted
according to their contribution to the overall error. The process of scoring the input,
calculating the loss, and updating the parameters of the model is repeated until the
error can no longer be reduced. The selection of the loss function depends on the
application. For
cases where the output should have two classes, as it is the case for flood mapping with
classes [flood, non-flood], one of the most used loss functions is the binary cross-entropy
or log loss. Its mathematical expression is:
\[
H_p(q) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p(y_i)) + (1 - y_i) \log(1 - p(y_i)) \right] \tag{3}
\]
where y_i is the label or true value of pixel i and p(y_i) is the predicted probability
that this pixel is flood, over all N pixels or inputs. Various feedforward NN
architectures have been introduced for various applications, such as Autoencoder models,
Convolutional NN models (CNN), Recurrent NN models, and Recursive NN models [3]. Section
4.2.1 describes CNNs. Sections 4.2.2 and 4.2.3 describe two architectures that use CNNs.
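Equation 3 can be evaluated directly, as in the following minimal sketch. The function name is hypothetical, and the clipping constant `eps` is an assumption added only to avoid taking the logarithm of zero; it is not part of the equation.

```python
import math


def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Mean binary cross-entropy over N pixels, following equation 3.

    labels: true values y_i (1 = flood, 0 = non-flood)
    probabilities: predicted probabilities p(y_i) of the flood class
    eps: assumed clipping constant that keeps log() finite
    """
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(labels)


# Hypothetical predictions for four pixels with labels [flood, flood, dry, dry].
good = binary_cross_entropy([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
poor = binary_cross_entropy([1, 1, 0, 0], [0.4, 0.3, 0.7, 0.6])
```

Predictions that agree with the labels give a smaller loss than predictions that disagree, which is the signal used to adjust the weights and biases.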
The feedforward NN considered in this study is a CNN. CNNs are heavily
used in computer vision and image analysis applications, as they have proven to be
very efficient in classification problems. They have also been used to compare images
in different contexts [19, 19, 5]. The three essential operations that define CNNs are
convolution, non-linearity, and pooling [15]. Each operation is discussed in turn
to explain how CNNs work.
\[
y_j = \sum_{i=1}^{n} w_{ij} * x_i + b_j \tag{4}
\]
where * is a two-dimensional discrete convolution operator.
The activation function, the second characteristic CNN operation, adds
non-linearity to the model, because filtering (the convolution operation) and matrix
operations are linear and cannot reflect the non-linearity of the raw data. The
mathematical functions used as activation functions vary with the application,
and widely used examples are the Sigmoid function, the Tanh function,
and the Rectified Linear Unit function. A sigmoid function takes a real-valued
number as input and maps it into the range between 0 and 1. Consequently, large negative
numbers are mapped close to 0 and large positive numbers close to 1. The hyperbolic
tangent function takes a real-valued number as input and maps it into the range between
-1 and 1. The Rectified Linear Unit (ReLU) function is linear for all positive values
and zero for all negative values. It is defined as:
\[
f(x) = \max(0, x) \tag{5}
\]
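The three activation functions named above can be written in a few lines. This is an illustrative sketch with hypothetical function names; the sigmoid is written in a two-branch form only to avoid overflow for large negative inputs.

```python
import math


def sigmoid(x):
    """Maps any real number into the range (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow of exp(-x) for very negative x
    return z / (1.0 + z)


def tanh(x):
    """Maps any real number into the range (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Equation 5: identity for positive values, zero otherwise."""
    return max(0.0, x)


samples = [-5.0, 0.0, 5.0]
table = [(x, sigmoid(x), tanh(x), relu(x)) for x in samples]
```

Evaluating the functions on a few sample values shows the saturation of the sigmoid and tanh functions at both ends and the one-sided behavior of ReLU.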
Some parameters affect the size of the feature map; the main ones are the
padding, the depth (number of filters), and the stride [8]. Padding helps to
avoid shrinking outputs and the loss of information at the borders of an image or feature
map. When zero padding is added to the border of an input image matrix, the filter
can be applied to the border elements too. The second parameter, depth, is
the number of filters applied. Therefore, if 3 filters are applied to an image with one
channel, the result is 3 feature maps stacked as a matrix with a depth of 3. The
stride sets the step, in pixels or input elements, by which the filter is shifted while
sliding over the input. Thus, the feature map size decreases as the stride
increases.
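The effect of padding and stride on the feature map size can be checked with a small sketch. The function names are hypothetical, and the loop-based convolution is written for readability rather than speed. As in most deep learning libraries, the kernel is applied without flipping.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Feature-map size along one axis: floor((n + 2p - k) / s) + 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d(image, kernel, padding=0, stride=1):
    """Slide one square kernel over a 2-D image with optional zero padding."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    # Zero-pad the image on all four sides.
    padded = [[0.0] * (cols + 2 * padding) for _ in range(rows + 2 * padding)]
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            padded[r + padding][c + padding] = value
    out_rows = conv_output_size(rows, k, padding, stride)
    out_cols = conv_output_size(cols, k, padding, stride)
    return [
        [
            sum(kernel[i][j] * padded[r * stride + i][c * stride + j]
                for i in range(k) for j in range(k))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


image = [[1.0] * 4 for _ in range(4)]    # hypothetical 4x4 one-channel image
kernel = [[1.0] * 3 for _ in range(3)]   # hypothetical 3x3 filter
valid = conv2d(image, kernel)                       # 2x2: the output shrinks
same = conv2d(image, kernel, padding=1)             # 4x4: size is preserved
strided = conv2d(image, kernel, padding=1, stride=2)  # 2x2: stride reduces size
```

Applying several kernels to the same image and stacking the results gives a feature map whose depth equals the number of filters.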
The third operation that defines CNNs is pooling. This operation sub-samples
the output of the lower layer to achieve translational invariance. The most
significant information is kept while the dimension of the lower layer's output is
reduced. Applying pooling does not prevent the network from learning about slight changes
in images, because these changes are maxima within their respective windows and are
therefore kept. Pooling also reduces the number of computations done in the network,
which helps to control overfitting [14].
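A minimal sketch of max pooling with non-overlapping windows is given below. The function name and the sample feature map are assumptions for illustration.

```python
def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool2d(feature_map)

# Moving the maximum of the top-left window by one pixel leaves the output unchanged.
shifted = [row[:] for row in feature_map]
shifted[1][0], shifted[1][1] = 2, 4
pooled_shifted = max_pool2d(shifted)
```

The second call illustrates the translational invariance mentioned above: a small shift of the strongest response inside a window does not change the pooled result.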
A CNN architecture can end with a fully connected layer used
as a classifier, but this is not a characteristic operation of CNNs. Each node in the fully
connected layer is connected to all nodes in the lower layer and calculates the class
probabilities based on the high-level features forwarded to it.
4.2.2 U-net
CNNs are also used for semantic segmentation. The task is achieved
by assigning an object class to each individual pixel in the input image. A widely used
CNN architecture for semantic segmentation is the U-net [16]. While many architectures
have been presented for the same task, U-nets can be used when the dataset is small,
with the help of extensive data augmentation.
Semantic segmentation with U-nets is mainly done by down-sampling an image for
feature extraction, then up-sampling using deconvolutional layers to construct a
pixel-wise classification map. A deconvolution layer is simply a layer where a
deconvolution operation, in other words the transpose of a convolution
operation, is applied [11]. The U-net architecture consists of an encoder joined to a
symmetrical decoder [11].
The encoder is a convolutional network, and its task is to condense the spatial
dimensions of the input image into a set of meaningful and significant features. For this
purpose, the encoder applies convolution operations followed by max pooling
(down-sampling) to encode the input image into a matrix of feature representations at
different levels. At each level, the extracted features become more abstract and
carry more semantic information. This architecture differs from
CNNs that discard all spatial information to produce high-level semantic class labels for
object recognition. CNNs used for change detection need to preserve the
spatial information to predict pixel-wise semantic changes. Thus, a decoder is essential
for the change detection architecture.
The decoder is the deconvolutional network; it uses these sets of
features to construct a segmented or labeled representation of the input image. The
decoder applies an up-sampling operation, concatenation, and then convolution operations.
The concatenation in U-net is done by copying the features from layers in the encoder and
concatenating them with the corresponding layer (of the same level) in the decoder.
This ensures that pixel positions are preserved after the convolution and max
pooling operations. Consequently, U-nets can be used for change detection between two
images from different time periods.
To combine what was described about the components of a U-net, figure 5 shows
the structure of a U-net architecture with the main discussed operations. The input of
the U-net in the figure is an image of size 256x256 with 3 channels.
Change detection is the procedure of finding the relevant changes by observing the
object or area at different timeslots. Change detection is one of the main applications of
remotely sensed data from satellites, for the reason that repetitive coverage is at short
intervals and image quality is consistent [18]. Flood detection can be handled as a change
detection problem. The specific assumption to be taken is that the detected change is
related to change of water surfaces only. For this study, the semantic segmentation
is relevant since it can be adopted for change detection, and U-net is an appropriate
architecture to use in this case. Because the U-net takes one input, a Siamese NN
(SNN) will be used to get features from an image pair.
SNN is a network that has two branches with the same architecture and shared
weights. Each branch takes one image input from the image pair, and the output of
both branches is combined to give the labeled map. The choice of using SNNs is not
only done for the need of feeding an image pair, but also because this methodology
guarantees that feature extraction from the image pair is performed with the same
approach, since both branches have the same weights and the image pairs have the
same type of sensor and similar characteristics.
Although statistical or rule-based methods are widely used for detecting water and
floods, they have limitations. The most obvious one is the
selection of an appropriate threshold, which can be difficult if the flooded area
is characterized by ground variations (e.g., urban and non-urban areas). Other
factors include the presence of shadows due to the illumination from the sun and
atmospheric changes when using optical data. These changes can affect the
flood detection method if they are not taken into consideration. Selecting and
preprocessing the datasets before use is therefore necessary.
As previously stated, satellite data can be very useful in case of flood events because
of its measurement repetition rate, constant resolution, and accessibility. Using both
optical and SAR data makes it possible to compare their performance
in this kind of situation. Each has advantages and disadvantages, so using
both can compensate for each other's weaknesses.
Labeled SAR data for floods is rare, mainly because SAR imagery is less intuitive to
interpret than optical data owing to its speckle noise [10]. The labeling
task therefore becomes more challenging to do manually. Nonetheless, SAR data is
beneficial for detecting water because the backscattering of the
signal varies with the object or surface it senses, as previously stated in the
Sentinel 1 section. In addition, its independence from atmospheric and sunlight
conditions is an advantage in the cloudy and bad weather that accompanies a flood
event. The final input SAR data used by the proposed algorithm consists
of the coherence map of the study region and the sigma nought values,
converted from pixel values in digital numbers, for two different polarizations.
These 3 products are extracted for two time periods: one before the flood
event and one after the flood event.
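The conversion from digital numbers to sigma nought is not detailed here. As a hedged sketch, the radiometric calibration commonly applied to Sentinel 1 Level-1 products divides the squared pixel amplitude by the squared calibration gain taken from the look-up table shipped with the product. The function names and the numeric values below are assumptions, and the decibel conversion is an optional step that this thesis does not state.

```python
import math


def sigma_nought(dn, calibration_gain):
    """Linear sigma nought from a digital number: |DN|^2 / A^2.

    calibration_gain (A) is assumed to come from the sigma nought
    calibration look-up table of the product, interpolated to the pixel.
    abs() also covers complex-valued SLC samples.
    """
    return (abs(dn) ** 2) / (calibration_gain ** 2)


def to_decibel(linear_value):
    """Convert a positive linear backscatter value to decibels."""
    return 10.0 * math.log10(linear_value)


# Hypothetical pixel: DN = 100 with a calibration gain of 1000.
linear = sigma_nought(100, 1000)   # 0.01
decibel = to_decibel(linear)       # about -20 dB
```

Smooth open water typically yields low sigma nought values, which is why flooded areas appear dark in the calibrated images.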
Optical data, on the other hand, depends on weather conditions and cannot
be acquired at night. Consequently, the sky needs to be free of clouds
to sense flooded areas. When that is the case, optical data is easier to label manually
than SAR data. In addition, the data is not affected by speckle noise. More
importantly, the ability to acquire images in various wavelength bands is useful for water
detection.
The satellite sensor captures the RGB bands, which are widely used in
classification problems. This study focuses on another combination of Sentinel 2 bands:
band 3, band 8, and band 12. These bands were chosen because they are the three bands
having high reflectance for pixels with
clear or turbid water. Band 12 is also used because it helps to enhance the contrast
between urban land and water, as water shows strong absorption and strong
radiation in this band [20]. Figure 6 shows the reflectance of clear and turbid water
with respect to the wavelengths of band 3 and band 8.
Figure 6: Reflectance of clear and turbid water with respect to the wavelengths of band
3 and band 8
4.4 Conclusion
This chapter defined the concepts and gave the theoretical background
needed to describe the approach developed for the thesis
problem. It also explained the motivation behind choosing to work with SAR datasets
from the Sentinel 1 satellites and optical datasets from the Sentinel 2 satellites for
flood detection using SNNs.
In this chapter, the proposed model is presented. The proposed identical models to
map floods are built from scratch. Hence, no pretrained models are used. All parts of
the model are described in detail in sections 5.1 and 5.2.
5.1 U-net
The U-net architecture contains an encoder and a decoder. The appearance of water
can vary between images even when nothing has changed at the semantic level. Given
that the semantic space is favorable to change detection, projecting the original
image into it is a useful step. The U-net encoder performs this projection. The encoder of
the network consists of multiple convolution layers, max-pooling layers, and dropout
layers. Its outputs are semantic features of the images. The decoder takes these
features to recover the spatial information and produce the flood map. Deconvolution
layers, up-sampling layers, dropout layers, and convolution layers are used to form the
decoder.
Figure 5 shows the base U-net architecture used in this study for all flood mapping
models. The input and output dimensionality is 256x256x3 in accordance with
the datasets mentioned in the previous section. The hidden layers use the ReLU
activation function, and the Sigmoid activation function is employed in the output layer.
All convolution operations in the model have a kernel size of 3x3 and use zero padding.
Each max-pooling layer and concatenation layer is followed by a dropout layer. All
deconvolution operations in the model use a kernel size of 3x3, a stride of 2, and a
padding of 0.
The encoder of the architecture has five levels. Each of the first four levels applies
two convolution operations to its input, followed by a max-pooling operation with a
kernel size of 2. At the bridge, or fifth level, two convolution operations are applied
but no max-pooling operation is used. The decoder has four levels, each consisting of
one deconvolution operation followed by two convolution operations. The final level of
the decoder has a third convolution operation that reduces the output to one channel,
the flood map. The number of kernels at the first level of the encoder is 16 and is
doubled at each subsequent level, reaching 256 at the fifth level. The decoder mirrors
this: the number of filters is halved at each level until it reaches 16 again, before
the final convolution is applied. The output is a flood map of size 256x256x1.
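The level arithmetic of section 5.1 can be checked with a short sketch. This is not the
thesis implementation: the function name `unet_level_shapes` is hypothetical, and the
sketch assumes that the convolutions preserve the spatial size, that each max-pooling
halves it, and that each deconvolution doubles it.

```python
def unet_level_shapes(input_size=256, base_filters=16, encoder_levels=5):
    """Return (encoder, decoder) lists of (spatial_size, filters) per level.

    Assumptions (not stated explicitly in the text): convolutions keep the
    spatial size, 2x2 max-pooling halves it, each deconvolution doubles it.
    """
    encoder = []
    size, filters = input_size, base_filters
    for level in range(encoder_levels):
        encoder.append((size, filters))
        if level < encoder_levels - 1:  # the bridge level has no max-pooling
            size //= 2
            filters *= 2

    decoder = []
    for _ in range(encoder_levels - 1):
        size *= 2
        filters //= 2
        decoder.append((size, filters))
    return encoder, decoder


encoder, decoder = unet_level_shapes()
for size, filters in encoder + decoder:
    print(f"{size}x{size}x{filters}")
```

Under these assumptions the bridge works on 16x16 feature maps with 256 filters, and the
last decoder level returns to 256x256 with 16 filters before the final one-channel
convolution.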
5.2 Networks
For flood mapping, an image pair is used: water bodies are mapped in the pre-flood
image and in the post-flood image, and the pre-flood water map is subtracted from the
post-flood water map to obtain the inundation. Therefore, the SNN introduced in the
second chapter is used as the backbone model for flood mapping with SAR data and with
optical data. The overall structure is shown in figure 7. The SNN is composed of two
U-nets that perform semantic segmentation on the image pair, and a differentiator that
computes the output of the flood mapping model. The differentiator only subtracts the
tensors given by the two semantic segmentation networks. The structures of the U-nets
are identical and both branches share the same weights, because the two images of each
pair must be projected into the same semantic feature space to be compared.
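The weight sharing and the differencing step can be illustrated with a minimal sketch in
plain Python. The function `toy_segment` is a hypothetical stand-in for the trained
U-net, and the threshold of 0.5 is an assumed value, not one reported in this work.

```python
def siamese_flood_map(pre_image, post_image, segment, threshold=0.5):
    """Apply one shared segmentation function to both images and subtract."""
    pre_water = segment(pre_image)    # same function, hence shared weights
    post_water = segment(post_image)
    difference = [
        [post - pre for pre, post in zip(pre_row, post_row)]
        for pre_row, post_row in zip(pre_water, post_water)
    ]
    # Assumed post-processing: keep pixels whose water score clearly increased.
    flood = [[1 if value > threshold else 0 for value in row] for row in difference]
    return difference, flood


def toy_segment(image):
    """Hypothetical stand-in for the U-net: dark pixels count as water."""
    return [[1.0 if pixel < 0.2 else 0.0 for pixel in row] for row in image]


pre = [[0.1, 0.8], [0.7, 0.9]]
post = [[0.1, 0.1], [0.7, 0.05]]
difference, flood = siamese_flood_map(pre, post, toy_segment)
print(flood)
```

The pixel that is water in both images (a permanent water body) yields a difference of
zero, so only newly inundated pixels remain in the flood map.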
Figure 7: Overall network
6 Experimental Settings
One of the major challenges in using DNNs for remote sensing applications is the
availability of data. As stated earlier, labeled Sentinel 1 data is hard to find because
SAR data has only recently started to be used in deep learning. Another challenge is
finding Sentinel 2 data that covers a flooded area without clouds.
The first flood events considered for this study were located in Europe, because some
selected flood maps are made available to users through the Copernicus Emergency
Management Service [1]. The service could not be triggered by the public: only
authorized users were able to request the production of flood maps for a desired
region. Thus, only the flood maps already accessible to the public could be checked.
However, when we inspected the Sentinel 2 products for the dates and locations used by
the inundation mapping service, the images were mostly covered with clouds, which
prevented the sensor from imaging the surface. Therefore, the overall process of
acquiring SAR and optical data, processing it, and producing ground truth maps was
carried out by us.
Choosing the right region for the experiment was not a trivial task. The ground truth
map was to be derived mainly by labeling the optical data, because this is easier and
more practical. Hence, the sensing times of both sensors over the area of interest,
before and after the flood, should be very close to each other, so that the ground truth
can be assumed usable for both sensors. In addition, the extent of the flooded regions,
the evolution of the inundation over time, and the cost of data (only free data were
used) were taken into account.
After some research, two regions in Iran were selected for this study: the region of
Aq-Qala and the region of Ahwaz. The first region was inundated by the Golestan floods
that hit the northeast of the country and were caused by heavy rains starting on 19
March 2019. This area is used for training and testing both the proposed model for SAR
data and the proposed model for optical data. The second region is located in southwest
Iran and was also hit by heavy rains, just a few days after the floods in the north. It
is used only for the network based on optical data. Inundations in both regions were
observed for longer than 20 days. Figure 8 shows a picture of the floods taken in
Khuzestan province, where Ahwaz is located.
Figure 8: An aerial view of flooding in Khuzestan province, Iran, April 5, 2019
Both regions contain urban and non-urban areas, and the floods submerged parts of the
cities, plain fields, and rivers in both urban and non-urban areas. Therefore, many
cases are represented in the study regions.
Given the described constraints, SAR and optical data were obtained from Sentinel
Scientific Data Hub [4]. Table 4 details the selected dates of satellite sensing.
Table 4: Satellite data and sensing dates selected for this study
Satellite       Bands Used                      Spatial Resolution (m)  Date        Study Area  Data Source
Sentinel 1 A    Dual polarized C band (VH, VV)  5x20 (SLC)              27.02.2019  Aq-Qala     [4]
Sentinel 1 A    Dual polarized C band (VH, VV)  5x20 (SLC)              11.03.2019  Aq-Qala     [4]
Sentinel 1 A    Dual polarized C band (VH, VV)  5x20 (SLC)              04.04.2019  Aq-Qala     [4]
Sentinel 2 L2A  G, NIR, SWIR2                   10, 20                  11.03.2019  Aq-Qala     [4]
Sentinel 2 L2A  G, NIR, SWIR2                   10, 20                  05.04.2019  Aq-Qala     [4]
Sentinel 2 L2A  G, NIR, SWIR2                   10, 20                  22.03.2019  Ahwaz       [4]
Sentinel 2 L2A  G, NIR, SWIR2                   10, 20                  26.04.2019  Ahwaz       [4]
For the first region (Aq-Qala), Sentinel 2 data was obtained for two dates, one before
and one after the event. The same was done for the second region. Sentinel 1 data, which
was used only for the first region, was obtained for two dates before the inundation and
one date after the event. The reason is that computing the coherence of SAR data
requires two acquisitions: the two pre-event dates were used to compute the coherence
before the flood, and the last pre-event date together with the post-event date was used
to compute the coherence after the flood.
The first phase of data pre-processing consists of three main procedures. First, the
optical data is transformed into a 16-bit georeferenced stack of bands 3, 8, and 12 by
exporting the desired bands as GeoTIFF files in the ESA SNAP tool. The data are level 2A
products, so the bands are orthophotos in UTM/WGS84 projection and are radiometrically
calibrated. Second, the SAR data is processed in the ESA SNAP tool to calculate the
coherences [7] and the sigma nought values (in dB) from the pixel digital numbers. All
final SAR products are also geocoded. Finally, the optical and SAR data are used to
produce a flood map as the ground truth, using statistical methods and manual labeling
in the QGIS and ENVI software products.
Figure 9 shows the stack of the NIR, Green, and SWIR bands (displayed respectively
in RGB) for data before and after the flood in Aq-Qala region. Figure 10 shows similarly
the stack of bands for Ahwaz.
Figure 9: Aq-Qala optical data. (Left: R, NIR, SWIR composite of 11.03.2019. Right:
R, NIR, SWIR composite of 05.04.2019)
Figure 10: Ahwaz optical data. (Left: R, NIR, SWIR composite of 22.03.2019. Right:
R, NIR, SWIR composite of 26.04.2019)
The calculation of the coherence is the first step of the SAR data preprocessing, as the
coherence is one of the three channels fed to the model for change detection. It is
computed with the SNAP tool in the following steps, in order:
• Data from two different time stamps are coregistered in the software. This operation
can be done with only one sub-swath from each input. After checking the data from
Sentinel 1 and Sentinel 2, sub-swath 1 was chosen as it covers the intersection
area between all products. Therefore, only this part of the data is clipped and
used for further processing.
• Data acquired in IW mode is organized in sub-swaths and bursts, so the coherence
product is debursted to merge all the bursts into one image.
• Terrain correction is applied to the output without noise
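For readers unfamiliar with the quantity itself, the following minimal sketch shows the
textbook estimator of the coherence magnitude between two co-registered complex windows,
that is, the magnitude of the normalized complex cross-correlation. It is an
illustration only: in this study the coherence is produced by SNAP, and the function
name, window contents, and phase offset below are hypothetical.

```python
import cmath
import math


def coherence_magnitude(first, second):
    """Coherence magnitude of two co-registered complex sample windows."""
    cross = sum(a * b.conjugate() for a, b in zip(first, second))
    power_first = sum(abs(a) ** 2 for a in first)
    power_second = sum(abs(b) ** 2 for b in second)
    denominator = math.sqrt(power_first * power_second)
    return abs(cross) / denominator if denominator > 0 else 0.0


window = [1 + 1j, 2 - 1j, -1 + 0.5j]
# Same scatterers with a constant phase offset: fully coherent.
unchanged = [sample * cmath.exp(0.3j) for sample in window]
# Different scattering in every sample: the coherence drops below 1.
changed = [1 - 1j, -2 - 1j, 1 + 0.5j]

print(coherence_magnitude(window, unchanged))
print(coherence_magnitude(window, changed))
```

The estimator is bounded between 0 and 1, which makes it convenient as an image channel
next to the two intensities.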
The coherence processing steps listed above are applied to the data of 27 February and
11 March to obtain the coherence before the flood event, and likewise to the data of 11
March and 4 April to obtain the coherence after the flood event. As seen in Table 4,
the SAR data is acquired in two polarizations. Thus, two intensities are calculated for
both 11 March (pre-flood) and 4 April (post-flood). Each intensity is calculated with
the following steps, in order:
• The data is clipped to the area used in the first step of the coherence computation,
for both dates.
• The output of the first step is radiometrically calibrated by calculating the sigma
nought value σ0 of the pixels.
• The output from the second step is debursted to merge all bursts into one image
Hence, the processes for computing the coherences and intensities of SAR data result
in 6 one-band images: 2 coherence images for before and after the flood and 4 intensities
in VH and VV polarisations for pre and post flood. The final image pairs used by our
proposed model will be stacked as [coherence, intensity VH, intensity VV].
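The conversion of the intensities to decibels and the channel stacking can be sketched
as follows. This is a simplified illustration in plain Python with made-up pixel values.
The helper names and the clamping floor are assumptions, and the actual processing in
this study is done in SNAP.

```python
import math


def sigma0_to_db(sigma0_linear, floor=1e-10):
    """Convert a linear sigma nought value to dB, clamping non-positive input."""
    return 10.0 * math.log10(max(sigma0_linear, floor))


def stack_channels(coherence, intensity_vh, intensity_vv):
    """Stack three equally sized 2-D bands into rows of [coh, VH, VV] pixels."""
    return [
        [list(pixel) for pixel in zip(c_row, vh_row, vv_row)]
        for c_row, vh_row, vv_row in zip(coherence, intensity_vh, intensity_vv)
    ]


# Made-up 1x2 example bands (linear sigma nought for the intensities).
coherence = [[0.9, 0.2]]
vh_db = [[sigma0_to_db(v) for v in row] for row in [[0.01, 0.001]]]
vv_db = [[sigma0_to_db(v) for v in row] for row in [[0.1, 0.01]]]

stacked = stack_channels(coherence, vh_db, vv_db)
print(stacked)
```

Each pixel of the stacked image then carries three values, which matches the
three-channel input expected by the network.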
Figure 11 shows the resulting SAR images that will be used later in our algorithm.
The flood area can be visually seen when comparing both images.
Figure 11: Aq-Qala SAR data. (Left: coherence, intensity VH, intensity VV composite
of 11.03.2019. Right: coherence, intensity VH, intensity VV composite of 04.04.2019)
The products from all sensors need to be georeferenced. When imported into the QGIS
software, all optical and SAR images are projected to the same coordinate system. All
images are then clipped to their common intersection, which is the following rectangle
with coordinates in EPSG:32640 - WGS 84 / UTM zone 40N:
For the optical data used only by the optical flood detection algorithm, the bounding
box has the following coordinates projected in EPSG:32639 - WGS 84 /
UTM zone 39N:
The evaluation of NDWI and MNDWI for the assessment of waterlogging in [17] concluded
that MNDWI achieved a high accuracy in this task, since it scored positive values for
water features mixed with vegetation and for flooded built-up features. Therefore,
MNDWI can be used as part of the process of labeling the floods in the study area to
get the flood map. Also, as figure 11 shows, the flood extent appears as dark pixels in
the SAR data, so the SAR data is used as well to produce the ground truth map. MNDWI is
computed using SNAP and Matlab, and a first version of the flood mask is obtained by
applying a threshold of 0.67 to the index. Some parts of the flood in the city of
Aq-Qala could not be captured with this threshold, therefore the ENVI software was used
to manually label those areas using visualizations of the optical and SAR bands.
The final flood map is displayed in figure 12.
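The thresholding step can be illustrated with a short sketch. It uses the usual
definition of MNDWI as the normalized difference of the green and SWIR reflectances
[20]. The reflectance values are made up, the helper names are hypothetical, and the
sketch assumes that pixels with an index above the threshold are labeled as water.

```python
def mndwi(green, swir):
    """Modified Normalized Difference Water Index for one pixel."""
    total = green + swir
    return (green - swir) / total if total != 0 else 0.0


def water_mask(green_band, swir_band, threshold=0.67):
    """Label a pixel 1 (water) when its MNDWI exceeds the threshold."""
    return [
        [1 if mndwi(g, s) > threshold else 0 for g, s in zip(g_row, s_row)]
        for g_row, s_row in zip(green_band, swir_band)
    ]


# Made-up reflectances for a 2x2 patch.
green = [[0.10, 0.08], [0.06, 0.12]]
swir = [[0.01, 0.20], [0.05, 0.02]]

mask = water_mask(green, swir)
print(mask)
```

A mask obtained this way is only a first version. As described above, areas missed by
the threshold still have to be labeled manually.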
The same process was applied to the data of the Ahwaz region. Because there was no SAR
data for this region, a ground truth map was later obtained from Copernicus, which
provided flood maps for some intersecting areas, and its accuracy was checked against
flood area maps produced by official authorities. The two maps were compared, and the
resulting label map is shown in figure 13.
Figure 13: Final flood mask of Ahwaz region
6.1.5 Data Augmentation
Data augmentation is done with the balance of the class labels in mind. In other words,
the final optical and SAR datasets should contain balanced numbers of flood and
non-flood pixels. Counting the labels in each class shows that non-flood pixels
outnumber flood pixels by a large margin. Therefore, after tiling the images into
patches of 256 by 256 pixels with 3 channels, the patches containing a large flood area
are augmented until the classes are approximately balanced. Finally, the data is
shuffled and split into training, validation, and testing sets with the percentages
64%, 16%, and 20%, respectively.
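The tiling and splitting steps can be sketched as follows. The sketch assumes
non-overlapping tiles and a fixed random seed, neither of which is specified in the
text, and the helper names are hypothetical. The augmentation of flood-rich patches is
not shown.

```python
import random


def tile_origins(height, width, tile=256):
    """Top-left corners of non-overlapping tiles that fit fully in the image."""
    return [
        (row, col)
        for row in range(0, height - tile + 1, tile)
        for col in range(0, width - tile + 1, tile)
    ]


def split_dataset(items, train=0.64, val=0.16, seed=0):
    """Shuffle the items and split them into training, validation, test."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


origins = tile_origins(1024, 768)
train_set, val_set, test_set = split_dataset(range(100))
print(len(origins), len(train_set), len(val_set), len(test_set))
```

The test set receives whatever remains after the training and validation subsets, so
the three subsets always cover the whole dataset without overlap.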
The confusion matrix is used to assess flood detection performance. Table 5 shows the
confusion matrix used in this experiment. Its entries are the numbers of true positive
(TP), true negative (TN), false positive (FP), and false negative (FN) pixels, with
flood as the positive class.
Four simple criteria are derived from the matrix to measure how close the predicted
values are to the true values:
• The accuracy A is the fraction of correctly labeled pixels among the total number of
pixels. Its mathematical expression is:
\[ A = \frac{TP + TN}{TP + TN + FP + FN} \tag{6} \]
• The recall R is the fraction of actual flood pixels that are correctly classified. Its
mathematical expression is:
\[ R = \frac{TP}{TP + FN} \tag{7} \]
• The precision P is the fraction of pixels labeled as flood that are actually flood
pixels. Its mathematical expression is:
\[ P = \frac{TP}{TP + FP} \tag{8} \]
• The F1 score measures the balance between the criteria R and P. Its mathematical
expression is:
\[ F_1 = \frac{2 \cdot R \cdot P}{R + P} \tag{9} \]
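The four criteria can be computed directly from the confusion matrix counts, as in the
following sketch. The counts in the example are made up for illustration and are not
results of this study, and the function name is hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, recall, precision and F1 score from confusion matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * recall * precision / (recall + precision)
        if (recall + precision)
        else 0.0
    )
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Made-up counts: 40 flood pixels found, 10 missed, 5 false alarms.
metrics = confusion_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The guards against empty denominators matter for patches that contain no flood pixels
at all, where recall and precision would otherwise be undefined.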
Pixel accuracy is the percentage of pixels in the image that are correctly predicted.
This metric can be misleading when the classes are not balanced in the dataset, because
it then mainly reports how well the class with the highest representation is predicted.
Given that class balance was verified in both datasets, this criterion can be used for
evaluation.
7 Experimental Results
This section reports the experiments conducted in this study. Section 7.1 lists the
training algorithm parameters, and Section 7.2 presents the performance of both models.
7.1 Training Algorithm Parameters
The networks are trained and tested with the TensorFlow platform. The input images have
a common size of 256x256x3. For the model based on optical data, a total of 1820 image
pairs were used. These pairs include data from both the first and the second study
region. The data was split into training, validation, and testing sets with the
percentages 64%, 16%, and 20%, respectively. For the model based on SAR data, only 304
image pairs were used, as SAR data was available only for the first study area. The data
was split into training, validation, and testing sets in the same way as the optical
data.
The model parameters are initialized from a normal distribution centered at 0. The
binary cross-entropy function is used as the loss that compares the predicted values
with the ground truth, and the parameters are optimized with the Adam technique [9].
The learning rate is initially set to 0.00001 and is reduced by a factor of 0.1 when the
validation accuracy has not increased for 3 epochs. The SNN for flood mapping using
optical data was trained for 33 epochs, and figure 14 shows the accuracy and loss
results for the training phase.
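The learning rate rule described above can be sketched in plain Python. The sketch is
similar in spirit to a reduce-on-plateau callback, but the exact bookkeeping (when the
counter resets, strict improvement) is an assumption, and the validation accuracies are
made up.

```python
def reduce_on_plateau(val_accuracies, initial_lr=1e-5, factor=0.1, patience=3):
    """Return the learning rate in effect during each epoch."""
    lr = initial_lr
    best = float("-inf")
    stalled = 0
    schedule = []
    for accuracy in val_accuracies:
        schedule.append(lr)
        if accuracy > best:
            best = accuracy
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:  # no improvement for `patience` epochs
                lr *= factor
                stalled = 0
    return schedule


# Made-up validation accuracies for seven epochs.
history = [0.70, 0.75, 0.74, 0.75, 0.73, 0.76, 0.76]
schedule = reduce_on_plateau(history)
print(schedule)
```

In this example the accuracy fails to improve during epochs three to five, so the
learning rate drops by a factor of ten from the sixth epoch on.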
Figure 14: Optical data based network training and validating accuracies and losses
(learning) curves
The SNN for flood mapping using SAR data was trained for 40 epochs and figure
15 shows the accuracy and loss results for the training phase.
Figure 15: SAR data based network training and validating accuracies and losses (learn-
ing) curves
For both networks, the best model was chosen with consideration to the lowest
validation loss value.
Table 6 summarizes the performance of the 2 models based on the evaluation metrics.
The flood detection using optical data performed quite well according to the evaluation
metrics. The detection of non-flood areas was better than the detection of flood areas,
but considering the small amount of training data, the results can be considered
acceptable. Figures 16 - 19 show some of the resulting flood maps. In each figure, from
left to right, the first plot is the NIR band (second channel) of the pre-flood image,
the second plot is the second channel of the post-flood image, the central plot is the
ground truth map, the fourth plot is the flood prediction map before applying the
threshold, and the last plot is the final predicted flood map after applying the
threshold. On all plots, the borders of the flood areas are drawn to make the flood
area easier to see.
Figure 19: Flood detection results
As for the SAR based flood detection model, the results were not better than those of
the optical model. This model also performed better at predicting non-flood pixels, but
given the small amount of data, it performed better than expected. Figures 20 - 22 show
some of the resulting flood maps. In each figure, from left to right, the first plot is
the pre-flood image, the second plot is the post-flood image, the central plot is the
ground truth map, the fourth plot is the flood prediction map before applying the
threshold, and the last plot is the final predicted flood map after applying the
threshold. On all plots, the borders of the flood areas are drawn to make the flood
area easier to see.
Figure 22: Flood detection results
7.3 Discussion
Comparing flood detection models that use different remote sensing data is a complex
task, because it requires an understanding of the complicated relationships between the
flood and the satellite characteristics. Throughout this experiment, the following
factors were found to be key parameters in this task:
• Weather conditions are important as they influence the results obtained with optical
data. Clouds affected optical data availability in the northwest of the Aq-Qala region.
SAR data was used to check the cloudy areas in Aq-Qala, but these areas were not
affected by floods.
• Noise in the SAR data is another factor that affected the performance of flood
mapping.
Given that U-nets proved to be effective in cases where data is not abundantly
available, future work includes training a single U-net for semantic segmentation of
individual images only, and then using this trained network in a Siamese network to
perform flood detection on other similar datasets.
8 Conclusion
This thesis presented two models for flood detection. The first model uses optical
satellite data from Sentinel 2 and the second model uses SAR satellite data from
Sentinel 1. We showed how a Siamese network was able to learn segmentation of SAR and
optical satellite data with the proposed methods. The accuracy of the first model for
detecting flood pixels in images reached 81%, with an F1 score of 94.67%, which can be
considered very satisfying given the small amount of data the network was trained on
and the various cases of water reflectance included in the dataset. The second model
accuracy in
The purpose was to provide cost-free and fast algorithms for flood detection using
different satellite sensor modalities. Future work includes the use of other satellites
to compare results with Sentinel 1 and Sentinel 2 data. Also, it would be interesting
to compare a network that combines both optical and SAR data with a single-sensor
network. Additional efforts can be made to eliminate the task of choosing a threshold
after differencing the segmentation maps of the Siamese sub-networks.
References
[1] Copernicus Emergency Management Service (© 2019 European Union), EMSR352 -
Floods in Iran.
[2] Sentinel-1.
[3] Sentinel-2.
[6] Yun Du, Yihang Zhang, Feng Ling, Qunming Wang, Wenbo Li, and Xiaodong Li.
Water bodies’ mapping from Sentinel-2 imagery with modified normalized difference
water index at 10-m spatial resolution produced by sharpening the SWIR band.
Remote Sensing, 8(4):1–19, 2016.
[7] Giorgio Franceschetti and Riccardo Lanari. Synthetic aperture radar processing.
CRC Press, 1999.
[9] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization.
International Conference on Learning Representations, 12 2014.
[10] E.E. Kuruoglu and J. Zerubia. Modeling SAR images with a generalization of the
Rayleigh distribution. IEEE Transactions on Image Processing, 13(4):527–533,
2004.
[11] Hyeonwoo Noh, Seunghoon Hong, and Bohyung Han. Learning deconvolution net-
work for semantic segmentation. 2015 IEEE International Conference on Computer
Vision (ICCV), page 1520–1528, 2015.
[12] Igor Ogashawara, Marcelo Curtarelli, and Celso Ferreira. The use of optical remote
sensing for mapping flooded areas. International Journal of Engineering Research
and Application, 3:1956–1960, 10 2013.
[14] Filip Radenovic, Giorgos Tolias, and Ondrej Chum. Fine-tuning CNN image retrieval
with no human annotation. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 41(7):1655–1668, Jan 2019.
[15] Waseem Rawat and Zenghui Wang. Deep convolutional neural networks for image
classification: A comprehensive review. Neural Computation, 29(9):2352–2449,
2017.
[16] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional net-
works for biomedical image segmentation. Lecture Notes in Computer Science
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015,
page 234–241, 2015.
[17] Sashikant Sahoo, Raj Setia, Shashikanta Sahoo, Avinash Prasad, and Brijendra
Pateriya. Evaluation of NDWI and MNDWI for assessment of waterlogging by
integrating digital elevation model and groundwater level. Geocarto International, 30,
07 2015.
[18] Ashbindu Singh. Review article digital change detection techniques using remotely-
sensed data. International Journal of Remote Sensing, 10(6):989–1003, 1989.
[19] Simon Stent, Riccardo Gherardi, Björn Stenger, and Roberto Cipolla. Detecting
change for multi-view, long-term surface inspection. Proceedings of the British
Machine Vision Conference 2015, page 127–1, 2015.
[20] Hanqiu Xu. Modification of normalised difference water index (NDWI) to enhance
open water features in remotely sensed imagery. International Journal of Remote
Sensing, 27(14):3025–3033, 2006.