Sensors: Learning To Detect Cracks On Damaged Concrete Surfaces Using Two-Branched Convolutional Neural Network
Article
Learning to Detect Cracks on Damaged Concrete
Surfaces Using Two-Branched Convolutional
Neural Network
Jieun Lee 1 , Hee-Sun Kim 2 , Nayoung Kim 1 , Eun-Mi Ryu 2 and Je-Won Kang 1, *
1 Department of Electrical and Electronic Engineering, Ewha Womans University, Seoul 03760, Korea;
leeje2993@gmail.com (J.L.); 12skdud21@naver.com (N.K.)
2 Department of Architectural and Urban Systems Engineering, Ewha Womans University, Seoul 03760, Korea;
heesun531@gmail.com (H.-S.K.); gogo5423@nate.com (E.-M.R.)
* Correspondence: jewonk@ewha.ac.kr
Received: 10 September 2019; Accepted: 1 November 2019; Published: 4 November 2019
Abstract: Image sensors are widely used for detecting cracks on concrete surfaces to help proactive and
timely management of concrete structures. However, it is a challenging task to reliably detect cracks
on damaged surfaces in the real world due to noise and undesired artifacts. In this paper, we propose
an autonomous crack detection algorithm based on convolutional neural network (CNN) to solve the
problem. To this aim, the proposed algorithm uses a two-branched CNN architecture, consisting
of sub-networks named a crack-component-aware (CCA) network and a crack-region-aware (CRA)
network. The CCA network learns gradient components of cracks, and the CRA network learns a
region of interest by distinguishing critical cracks from noise such as scratches. Specifically,
the two sub-networks are built on convolution-deconvolution CNN architectures, but they
comprise different functional components to achieve their own goals efficiently. The two
sub-networks are trained end-to-end to jointly optimize parameters and produce the final
output, which localizes important cracks. Various crack image samples and learning methods are used
to train the proposed network efficiently. In the experimental results, the proposed algorithm
provides better crack detection performance than the conventional algorithms.
Keywords: deep learning; crack detection; convolutional neural network; edge detection;
fire-damaged concrete; image processing
1. Introduction
Crack information can help timely and proactive management of concrete structures, and image
sensors are economically useful to detect the cracks on concrete surface as compared to other
sensors [1–4]. However, the conventional process of the visual inspection is too time-consuming
since it needs manual tracing of cracks on the surface image. Crack detection algorithms can perform
quantitative analysis on the strengths or lengths of edges to estimate a degree of safety. In practice,
such autonomous crack assessment is effectively used for safety diagnosis of concrete structures such
as bridges [1], nuclear plants [2], pavements [3], and tunnels [4] through image sensors.
The main challenge in crack detection is to identify only the important cracks whose widths and
lengths are greater than some thresholds, specified by a safety instruction [5]. Earlier crack detection
algorithms use edge detection and morphological image processing algorithms such as the Canny detector,
the Sobel mask, and the Laplacian of Gaussian (LoG) [6–8]. However, noise, tiny pores, and
scratches on the surfaces make cracks difficult to detect in the real world. The task is even
more challenging when the concrete surfaces are damaged by various factors [9–14]. For instance,
Figure 1. (a) The original concrete surface, (b) the crack map manually traced by an expert, (c) edge detection by Sobel mask, and (d) by Holistic Edge Detection (HED) [9].
Recently, Convolutional Neural Network (CNN) has been actively applied to various image processing and understanding problems such as edge detection [9], saliency detection [10], semantic segmentation and recognition [11]. The CNN uses automatic hierarchical feature learning in an end-to-end manner to allow for understanding different contexts in an image. Holistic Edge Detection (HED) [9] develops a CNN-based edge detection system, combining multi-scale and multi-level visual responses in convolution layers. Deep Contour-Aware Network (DCAN) [11] proposes to use multi-level contextual features to accurately detect contours and separate clustered objects. In [10], Pixel-wise Contextual Attention Network (PiCANet) is proposed to detect important or salient regions in an image.
In this paper, we propose an autonomous and reliable crack detection algorithm using CNN, extended from the preliminary work [12]. Even though a domain expert can easily identify critical cracks that can have significant impact on the safety evaluation of a concrete surface, it is a much more difficult task for an autonomous system due to undesired artifacts on the damaged concrete surfaces [15–17]. To solve the problem, the proposed algorithm is designed for localizing important cracks based on the recent advances in deep learning research. Our previous work focuses on safety evaluation of fire-damaged concretes by showing the correlation between the lengths of cracks, durations, and temperatures and structural performance. However, in this work, we rather show more thorough ideas on autonomous crack detection using deep learning. Experimental results conducted with various crack datasets show the proposed algorithm provides more accurate and reliable performance in crack detection compared to previous works.
Our contribution in this paper is as follows. We use a two-branched CNN architecture to efficiently distinguish the relevant crack and the other components such as noise and edge-like image components on the concrete surfaces. The intuition behind the proposed model is to use noise-suppression and region detection, inspired by old wisdom on conventional edge detection methods and multi-channel network architecture [18,19]. Specifically, one branch of the proposed network detects edges or contours, which are considered the most prominent components in cracks, and the other branch identifies a region-of-interest as in semantic segmentation. The features learned from the two different networks are combined for identifying the important cracks. Data acquisition and training strategy are important to overcome an over-fitting problem in deep learning and appropriately validate the performance. Therefore, to facilitate learning, we conduct our own fire experiments to obtain more crack image samples in the real world. It is noted that the fire-damaged concretes show many detailed cracks with combustion, so a reliable crack detection algorithm matters. Furthermore, the proposed algorithm uses skip connections made with convolution and deconvolution operations that can transfer the crack features trained in the lower layers to the higher layers on top of the U-net architecture [20].
Sensors 2019, 19, 4796 3 of 18
The rest of the paper is organized as follows. In Section 2, we review the previous studies.
In Section 3, we explain the proposed method. In Section 4, we show the proposed training strategy
and data acquisition. Experimental results are shown in Section 5. We conclude with remarks in
Section 6.
2. Related Works
Figure 2. U-net structure [20].
It is emphasized that our work differs from previous works. Our work is different from the recent CNN-based edge or contour detection algorithms [9,11,20] as the proposed technique has noise-suppression or region detection networks to localize only the critical cracks. Furthermore, our work is distinguished from the recent crack detection algorithms using deep learning [2,3,14–17,33] as it generates a pixel-wise crack map from the combination of two different sub-networks, i.e., one for detecting the crack components and the other for detecting the crack regions. The previous works focus on supervised learning for a classification problem in crack detection.
3. Proposed Crack Detection Network
3.1. Motivation
For efficient crack detection, a trade-off between noise suppression and localization needs to be considered. In other words, the detector may be able to find the precise location of the crack, but the effects of noise increase, and vice versa. The problem is especially challenging in crack images as they are captured in the wild and suffer from noise. To solve the problem, we propose a two-branched network architecture as shown in Figure 3, consisting of sub-networks named a crack-component-aware (CCA) network and a crack-region-aware (CRA) network. On the one hand, the CCA network is to find a low-level image feature regarding the crack, e.g., the representation of the anisotropy property of the crack, as the crack has many edge-like image features. On the other hand, the CRA network is to approximate the regions of interest by distinguishing crack and non-crack regions. In the CRA network, a higher weight is assigned to a region closer to the crack and vice versa. By combining the outputs of the two sub-networks, the proposed network can suppress small noise in non-crack regions that have been detected in the CCA network to improve the accuracy.
Figure 3. Proposed two stream CNN architecture.
3.2. Architecture Description
3.2.1. Crack-Component-Aware Network
The CCA network consists of seven convolution layers and three deconvolution layers as shown in Table 1. We connect the convolution layer to the deconvolution layer to effectively deliver the trained features in the lower layers to the higher layers, motivated by the symmetric U-net structure [20] and the skip connection. The skip connection is known to improve overall performance with slight increments of computational complexity [19].
Table 1. Implementation details of crack-component-aware network.
Layer                        Kernel Size   Stride   Feature Map
Conv1_1, Conv1_2             3             1        512 × 512 × 64
Pool1                        2             2        256 × 256 × 64
Conv2_1, Conv2_2             3             1        256 × 256 × 128
Pool2                        2             2        128 × 128 × 128
Conv3_1, Conv3_2, Conv3_3    3             1        128 × 128 × 256
Pool3                        2             2        64 × 64 × 256
Deconv1                      4             2        128 × 128 × 128
Deconv2                      4             2        256 × 256 × 64
Deconv3                      4             2        512 × 512 × 32
1×1 Conv                     1             1        512 × 512 × 1
Cross-entropy                1             1        512 × 512 × 1
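The feature map sizes in Table 1 can be checked with the standard output-size formulas for convolution, pooling, and deconvolution. The following is a minimal sketch rather than the authors' implementation: the padding values (1 for the 3 × 3 convolutions, 0 for pooling, 1 for the deconvolutions) are assumptions, since Table 1 does not list them, and the helper names `conv_out`, `deconv_out`, and `trace_shapes` are hypothetical.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel, stride, pad):
    """Spatial output size of a deconvolution (transposed convolution)."""
    return (size - 1) * stride - 2 * pad + kernel


# (name, kind, kernel, stride, ASSUMED padding, output channels from Table 1)
CCA_LAYERS = [
    ("Conv1_1", "conv", 3, 1, 1, 64),
    ("Conv1_2", "conv", 3, 1, 1, 64),
    ("Pool1", "pool", 2, 2, 0, 64),
    ("Conv2_1", "conv", 3, 1, 1, 128),
    ("Conv2_2", "conv", 3, 1, 1, 128),
    ("Pool2", "pool", 2, 2, 0, 128),
    ("Conv3_1", "conv", 3, 1, 1, 256),
    ("Conv3_2", "conv", 3, 1, 1, 256),
    ("Conv3_3", "conv", 3, 1, 1, 256),
    ("Pool3", "pool", 2, 2, 0, 256),
    ("Deconv1", "deconv", 4, 2, 1, 128),
    ("Deconv2", "deconv", 4, 2, 1, 64),
    ("Deconv3", "deconv", 4, 2, 1, 32),
    ("1x1 Conv", "conv", 1, 1, 0, 1),
]


def trace_shapes(input_size=512):
    """Return {layer name: (height, width, channels)} for a square input."""
    size = input_size
    shapes = {}
    for name, kind, kernel, stride, pad, channels in CCA_LAYERS:
        if kind == "deconv":
            size = deconv_out(size, kernel, stride, pad)
        else:  # convolution and pooling share the same size formula
            size = conv_out(size, kernel, stride, pad)
        shapes[name] = (size, size, channels)
    return shapes


shapes = trace_shapes()
for name, (h, w, c) in shapes.items():
    print(f"{name:9s} {h} x {w} x {c}")
```

Under these padding assumptions, each pooling layer halves the spatial size and each deconvolution doubles it, so the 512 × 512 input resolution is restored before the final 1 × 1 convolution.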
Motivated by the work in our algorithm, the architecture is designed to combine the output feature map of a convolution layer with that of the corresponding deconvolution layer at the symmetric position. The skip connection is made using internal convolution and deconvolution operations in our algorithm. To be specific, as zoomed in Figure 3, we apply a convolution layer using kernel size 2 and stride 2 for the internal convolution operation in the skip connection, so the feature map is halved to $C_f \times \frac{W_f}{2} \times \frac{H_f}{2}$. For the deconvolution operation in the skip connection, we use a deconvolution layer using kernel size 8 and stride 2 to recover the original feature size $C_f \times W_f \times H_f$. Afterwards, they are concatenated with the existing layers in the deconvolution layers of the CCA. By doing so, the contexts trained in the convolution layer are not lost, and the important characteristics of real cracks are maintained. We also show the ablation tests turning on and off the convolution and deconvolution layers of the skip connections in testing procedures. When the operation is turned on, the network can focus on the actual cracks. When it is turned off, the output shows other noise components, as shown in Figure 4.
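The size bookkeeping of the skip connection can be verified with a few lines of arithmetic. This is a sketch under stated assumptions: the text gives only the kernel sizes and strides, so the deconvolution padding of 3 is derived here from the requirement that the size exactly doubles, and the helper names and the example width of 256 are hypothetical.

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel, stride, pad=0):
    """Spatial output size of a deconvolution (transposed convolution)."""
    return (size - 1) * stride - 2 * pad + kernel


def doubling_pad(kernel, stride=2):
    """Padding p for which a deconvolution maps n to stride * n.

    (n - 1) * stride - 2p + kernel == stride * n  implies  p = (kernel - stride) / 2.
    """
    diff = kernel - stride
    if diff < 0 or diff % 2:
        raise ValueError("no integer padding gives an exact upscaling")
    return diff // 2


w_f = 256                                   # example feature-map width (assumed)
half = conv_out(w_f, kernel=2, stride=2)    # internal convolution of the skip path
pad = doubling_pad(kernel=8, stride=2)      # padding implied by kernel 8, stride 2
restored = deconv_out(half, kernel=8, stride=2, pad=pad)

print(half, pad, restored)
```

The stride-2 convolution halves the width, and with the derived padding the kernel-8 deconvolution brings it back to the original width, which is what allows concatenation with the decoder feature maps.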
Figure 4. Ablation tests turning on and off the convolution and deconvolution in skip connections.
The skip connections are located at the convolution layers before pooling layers. The kernel size in the convolution layer is set to 3. After the deconvolution layers, the CCA network predicts a map that localizes the ground truth of a crack map. We also add a max-pooling layer to every 2 or 3 convolution layers to reduce the number of parameters and prevent an overfitting problem.
3.2.3. Combination of CCA and CRA
The features trained in CCA and CRA are the same before pooling layer 3 in Tables 1 and 2, but the feature maps that have passed through the CCA and CRA networks are further combined with element-wise multiplication and one additional 1 × 1 convolution layer to precisely output the map of cracks, as shown in the final output of Figure 5. Figure 5 shows, from left to right, examples of input images, the ground truth, the intermediate outputs of the CCA and the CRA, and the final output. The output values of the CRA are adjusted to have a maximum pixel value of 255 in the examples for the purpose of visual comparison, while they are actually filled with some small values. The features trained in CCA capture detailed edges in concrete surfaces, but the results can include a lot of edges, blobs, and lines. Meanwhile, the CRA cannot extract detailed patterns, as shown in the fourth column, while being capable of determining whether the region is important or not. When the results are combined, the crack is appropriately extracted in the output. For instance, the input in the second row has no important cracks, and accordingly, the output map becomes empty even though the CCA points to several low-level edge features in the image.
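The effect of the element-wise multiplication can be illustrated on a toy example. This is a minimal sketch, not the trained network: the 3 × 3 maps are made-up values, the 1 × 1 convolution on a single-channel map is modeled as a scalar weight and bias (an assumption), and `combine_branches` is a hypothetical helper name.

```python
import numpy as np


def combine_branches(cca_map, cra_map, weight=1.0, bias=0.0):
    """Fuse the two branch outputs.

    The element-wise product keeps an edge response only where the region
    branch also responds. On a single-channel map, a 1x1 convolution reduces
    to a scalar scale and shift, modeled here by `weight` and `bias`.
    """
    fused = cca_map * cra_map
    return weight * fused + bias


# Toy CCA output: strong edges in column 0 (crack) and column 2 (scratch).
cca = np.array([[0.9, 0.1, 0.8],
                [0.9, 0.0, 0.7],
                [0.9, 0.1, 0.0]])

# Toy CRA output: high weight near the crack, falling to zero away from it.
cra = np.array([[1.0, 0.5, 0.0],
                [1.0, 0.5, 0.0],
                [1.0, 0.5, 0.0]])

fused = combine_branches(cca, cra)
mask = fused > 0.5          # illustrative threshold
print(mask.astype(int))
```

In this toy case the scratch in the third column is detected by the edge-like branch but removed by the product, because the region branch assigns it zero weight. If the region map is zero everywhere, the fused map is empty regardless of the edge responses.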
Figure 5. Examples of input images, the ground truth, the CCA output, the CRA output, and the final output (from left to right).
4. Training
4.1. Crack Data Acquisition
It is important to use a large number of data samples to train the deep neural network and validate the performance. We show how to acquire crack image samples and manage them for training. We use three crack databases, i.e., Fire Crack Dataset (FCD), CrackForest Dataset (CFD) [34], and AigleRN [35]. The FCD is obtained from fire experiments conducted by the authors to have a sufficient size of datasets. The acquisition process by fire experiments is shown in Figure 6. The CFD and the AigleRN are available online.
Figure 6. (a) Fire experiments to construct the database and (b) the concrete structures after the fire.
Figure 7. Sample concrete specimen and image acquisition steps.
In addition, domain experts in concrete materials and construction create the ground truth of
FCD manually. When generating the ground truth, we define a crack whose width is larger than
0.3 mm, or about three to four pixels, as a critical crack, as recommended in [5]. The crack regions are
marked with a pixel intensity of 255, the brightest value in an image, and the non-crack
regions are marked as the darkest pixels. It is noted that the concrete surfaces show some fire marks and paint as highlighted in the
yellow boxes of Figure 7. Such image patches may reduce the effectiveness of the proposed network,
but we include all the patches without processing to stay closer to practical scenarios. We use no input
normalization, noise removal, or histogram equalization in the data preparation.
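The relation between the 0.3 mm threshold and its pixel equivalent can be made explicit. The sketch below is illustrative only: the image resolution is not stated in the text, so the values of 0.075 to 0.1 mm per pixel are inferred from "three to four pixels", and both function names are hypothetical.

```python
CRITICAL_WIDTH_MM = 0.3  # threshold from the text, as recommended in [5]


def implied_mm_per_pixel(width_px, width_mm=CRITICAL_WIDTH_MM):
    """Resolution implied when `width_mm` spans `width_px` pixels."""
    return width_mm / width_px


def is_critical(width_px, mm_per_pixel, threshold_mm=CRITICAL_WIDTH_MM):
    """True if a crack of `width_px` pixels is wider than the threshold."""
    return width_px * mm_per_pixel > threshold_mm


# "About three to four pixels" corresponds to roughly 0.1 to 0.075 mm per pixel.
print(implied_mm_per_pixel(3), implied_mm_per_pixel(4))

# At an assumed 0.1 mm per pixel, a 5-pixel crack is critical, a 2-pixel one is not.
print(is_critical(5, 0.1), is_critical(2, 0.1))
```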
Figure 8. BSDS, VOC2012, and DRIVE datasets for learning in the proposed technique.
$$L_i = -\frac{1}{N} \sum_{x} \left[ y \ln \tilde{y}_i + (1 - y) \ln (1 - \tilde{y}_i) \right] \quad (1)$$
where $\tilde{y}_1$ and $\tilde{y}_2$ are the output images of the CCA and CRA, respectively. $\tilde{y}_3$ is the predicted output image, given as the element-wise multiplication of the CCA and CRA.
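Equation (1) is a pixel-wise binary cross-entropy averaged over the image. A minimal NumPy sketch follows. It assumes that the ground truth has been scaled to the range [0, 1] and that N counts the pixels; the clipping constant `eps` is added here only for numerical safety and is not part of the equation.

```python
import numpy as np


def cross_entropy_loss(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between a ground-truth map and a prediction.

    y_true: array of 0/1 labels (crack = 1), same shape as y_pred.
    y_pred: array of predicted crack probabilities in [0, 1].
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip to avoid log(0); this is a numerical guard, not part of Equation (1).
    y_pred = np.clip(np.asarray(y_pred, dtype=np.float64), eps, 1.0 - eps)
    per_pixel = y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    return float(-per_pixel.mean())


ground_truth = np.array([[1, 0], [0, 1]])
uninformative = np.full((2, 2), 0.5)

print(cross_entropy_loss(ground_truth, uninformative))   # equals ln 2
print(cross_entropy_loss(ground_truth, ground_truth))    # close to zero
```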
We train the network parameter h∗ to minimize L, i.e.,
where α1 = α2 = 0.3 and α3 = 0.4. To obtain the parameters, we use the standard backpropagation
algorithm with the ADAM optimizer. To be specific, the learning rate is $10^{-5}$ and is
multiplied by 0.1 every 10,000 iterations. The training is stopped after 10K iterations. The hyperparameters
are obtained empirically.
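The weighting of the three losses and the learning-rate decay can be sketched as follows. Two points are assumptions rather than statements from the text: the total loss is taken to be the weighted sum of the three losses with the weights α1, α2, and α3, and the decay is modeled as a step schedule in which the rate is multiplied by 0.1 after every full block of 10,000 iterations. The function names are hypothetical.

```python
ALPHAS = (0.3, 0.3, 0.4)  # alpha_1, alpha_2, alpha_3 from the text


def total_loss(branch_losses, alphas=ALPHAS):
    """Weighted sum of the CCA, CRA, and combined-output losses (assumed form)."""
    if len(branch_losses) != len(alphas):
        raise ValueError("expected one loss per weight")
    return sum(a * loss for a, loss in zip(alphas, branch_losses))


def step_learning_rate(iteration, base_lr=1e-5, gamma=0.1, step_size=10_000):
    """Step decay: multiply the rate by `gamma` once per `step_size` iterations."""
    return base_lr * gamma ** (iteration // step_size)


print(total_loss((0.50, 0.20, 0.10)))   # 0.3*0.5 + 0.3*0.2 + 0.4*0.1
print(step_learning_rate(0), step_learning_rate(9_999), step_learning_rate(10_000))
```

Because the weights sum to one, the total loss stays on the same scale as the individual losses, with slightly more emphasis on the combined output.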
5. Experimental Result
In this section, we evaluate the performance of the proposed algorithm. All experiments are
conducted on a GPU server with an Intel 3.5 GHz processor, 32 GB memory, and a GPU (GeForce GTX 1080,
NVIDIA, Santa Clara, USA). We used the Caffe deep learning software framework to
implement the proposed technique. We used around 3230 training samples and 135 testing samples in
the FCD dataset. We combine the sample images in the CFD and AigleRN datasets. The two datasets consist
of sample images of pavements, but the numbers of samples are relatively small for training a
convolutional neural network. We used 156 samples in the combined sets for testing.
The detection performance of the proposed algorithm is compared with that of other conventional
algorithms. Specifically, we first adopt state-of-the-art crack detection algorithms based on CNN, i.e.,
the Cha et al. method [14] and the Kim et al. method [15]. They use block-based classification techniques
to identify whether or not a tested block of an image presents a crack. We also compare the Lim et al.
method [21], recently developed for crack detection using the Laplacian of Gaussian (LoG).
Furthermore, we employ the HED network [9] and DCAN [11] for the comparisons since they are
state-of-the-art contour and edge detection algorithms based on CNN that can possibly be applied to
crack detection.
Moreover, Figure 10 shows the ROC curves using the CFD and AigleRN datasets. As can be seen,
the proposed algorithm also provides better detection performance than the other algorithms, while
the DCAN shows comparable performance. Quantitatively, the AUC value of the proposed algorithm
is 0.910, while those of the HED, DCAN, Lim et al., Cha et al., and Kim et al. are 0.795, 0.872, 0.867,
0.843, and 0.830, respectively.
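For readers who want to reproduce this kind of comparison, an ROC curve and its AUC can be computed from pixel-wise scores as sketched below. This is a generic illustration, not the evaluation script used for the reported numbers; the function names and the four-sample example are hypothetical, and both classes must be present in the labels.

```python
import numpy as np


def roc_curve_points(labels, scores):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over `scores`."""
    labels = np.asarray(labels).astype(bool).ravel()
    scores = np.asarray(scores, dtype=np.float64).ravel()
    order = np.argsort(-scores, kind="stable")      # descending score
    labels, scores = labels[order], scores[order]
    # Keep one point per distinct score so that ties share a threshold.
    distinct = np.r_[np.where(np.diff(scores))[0], labels.size - 1]
    tps = np.cumsum(labels)[distinct]                # true positives so far
    fps = np.cumsum(~labels)[distinct]               # false positives so far
    tpr = np.concatenate(([0.0], tps / tps[-1]))
    fpr = np.concatenate(([0.0], fps / fps[-1]))
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Toy example: 1 = crack pixel, 0 = background pixel.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_curve_points(labels, scores)
toy_auc = auc(fpr, tpr)
print(toy_auc)
```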
Figure 9. ROC curves of compared algorithms in FCD.
It is observed that some of the compared algorithms also provide fairly good performance in the detection, when using the CFD and AigleRN datasets. One reason can be the properties of the two datasets, which contain less noise in the concrete surfaces. In comparison, the FCD dataset poses more challenges in the detection as the surfaces are sometimes contaminated by combustion, and the images display a number of small and tiny cracks, given in the fire experiments. Accordingly, most of the algorithms show worse performance. Particularly, the conventional algorithm using edge detection such as Lim's method shows significantly different performance in Figures 9 and 10. However, the performance of the proposed algorithm is relatively reliable regarding the different datasets in the quantitative results.
For more quantitative comparisons, we calculate the precision, the recall, and the F-measure.
The precision is the proportion of actual crack samples among all samples that are
estimated to be cracks. The recall is the proportion of correctly detected crack samples among all
samples that actually are cracks. The $F_\beta$ score is the weighted harmonic mean of the precision and the recall. It
is mathematically given as
$$F_\beta = (1 + \beta^2) \frac{\text{precision} \cdot \text{recall}}{\beta^2 \, \text{precision} + \text{recall}} \quad (3)$$
where β is set to 1 in our evaluation. The precision, the recall, the F-measure, and the AUC values
are shown in Tables 3 and 4 for the FCD dataset and for the CFD and AigleRN datasets,
respectively. As described in the tables, the proposed algorithm yields significantly improved
performance as compared to the other algorithms.
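The three measures follow directly from the counts of true positives, false positives, and false negatives. The sketch below is a generic illustration; the function name and the example counts are hypothetical and are not taken from Tables 3 and 4.

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Precision, recall, and F-beta score from confusion counts.

    tp: crack samples correctly detected as cracks
    fp: non-crack samples wrongly detected as cracks
    fn: crack samples that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (1 + beta ** 2) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Made-up counts: 8 hits, 2 false alarms, 8 misses.
precision, recall, f1 = precision_recall_fbeta(tp=8, fp=2, fn=8)
print(precision, recall, f1)
```

With β = 1, as in the evaluation above, the score reduces to the plain harmonic mean of precision and recall, so a method cannot score well by maximizing only one of them.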
We also compare the computational complexity of the proposed algorithm with the tested
algorithms in terms of processing time and memory usage. We measured the time and the
memory while 10 input image samples were processed and report the averages to
obtain robust results. In our model, the time was around 0.585 s, and the CPU and GPU memory were
around 60 MB and 3.8 GB, respectively. As for Lim's method, the time was around 0.512 s, and the CPU
and GPU memory were around 42 MB and 1.4 GB, similar to Cha's method. Both methods are actually
developed for block classification, and their computational complexity was smaller
than that of the proposed algorithm. However, their detection performance is somewhat degraded. As for HED and
DCAN, they have deeper convolutional layers than the proposed algorithm, which increases the computational load:
the time was around 4.677 s and 1.077 s, the CPU memory around 410 MB and 380 MB, and the GPU memory
around 11 GB and 8.2 GB, respectively, all larger than those of the proposed algorithm. In our design, we also
attempted to use all convolution layers [42] instead of pooling layers. The performance varied only slightly
across the test datasets while the computational time increased. Thus, we use the pooling layers.
The visual comparisons of CFD and AigleRN are given in Figure 12. In the fourth and sixth columns,
the proposed algorithm and the HED output comparable results to the ground truth. Meanwhile,
Lim et al. and DCAN detect all the background textures of the crack images, which significantly
drops the detection performance. Especially in the fourth column, DCAN fails to distinguish the
crack and the background textures. HED successfully detects solid cracks in the third and in the fifth
columns. The results are given when the backgrounds are simple. However, if the background becomes
complicated as in the first column, HED has difficulties in the detection. Furthermore, Kim et al. and
Cha et al. show robust performance against complex background textures and the performance is
comparatively higher than that of FCD. However, the recognition rate is still low because of the lower
precision values.
In Figure 13, we point out the false positive errors and the false negative errors in the proposed
algorithm by subtracting the ground truth from the output images of the proposed algorithm to obtain
the error images. The green boxes represent the false positive errors, and the red boxes represent
the false negative errors. For instance, in the first row of Figure 13a, the concrete image has some
holes on the surface, and it is observed that the edges of the cavity are mistakenly detected as cracks.
The second row of Figure 13a shows the CCA and CRA output images for the first test input image.
Through the CCA network, all large and small holes were detected. In the CRA network, all small holes
were ignored, but large holes were recognized as a group of edges and detected as crack regions.
As a result, large holes were detected incorrectly.
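The subtraction used to obtain the error images can be sketched as follows. This is a minimal illustration that assumes both the output and the ground truth are binary masks (1 for crack pixels, 0 otherwise); the function name `error_maps` and the toy masks are hypothetical. Under that assumption, a difference of +1 marks a false positive and a difference of -1 marks a false negative.

```python
def error_maps(pred, truth):
    """Split (pred - truth) into false-positive and false-negative masks.

    pred and truth are equally sized 2-D lists of 0/1 labels (1 = crack).
    Binary inputs are an assumption of this sketch.
    """
    fp = [[1 if p - t == 1 else 0 for p, t in zip(pr, tr)]
          for pr, tr in zip(pred, truth)]
    fn = [[1 if p - t == -1 else 0 for p, t in zip(pr, tr)]
          for pr, tr in zip(pred, truth)]
    return fp, fn


# Hypothetical 3x3 example: one spurious detection, one missed crack pixel.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
truth = [[0, 1, 0],
         [0, 1, 0],
         [0, 1, 0]]
fp, fn = error_maps(pred, truth)
# fp marks pixel (0, 2); fn marks pixel (2, 1).
```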
In addition, in the first row of Figure 13b, the concrete image has both very thin and thick
cracks, and the thin crack is ignored when the difference in thickness is large. The second row of
Figure 13b shows the CCA and CRA output images for that input concrete image. When there is a significant
difference in thickness, the thin cracks are recognized as if the intermediate connection is broken.
However, they are not clearly recognized even in the CCA network. In the CRA network, only the
thick crack region was recognized, and the thin crack region was ignored. Finally, the thin cracks
recognized by CCA were completely eliminated by CRA.
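The failure mode described above can be reproduced with a toy example. The rule by which the two branch outputs are fused is not specified in this section; the element-wise product followed by a threshold used here is purely an illustrative assumption, and the function name `fuse`, the threshold, and the response values are hypothetical. The point is only that a pixel with a strong CCA response is lost whenever the CRA response at that location is weak.

```python
def fuse(cca, cra, threshold=0.5):
    """Gate the CCA edge response by the CRA region response.

    Element-wise multiplication is an ASSUMED fusion rule for illustration,
    not necessarily the rule used by the proposed network.
    """
    return [[1 if e * r >= threshold else 0 for e, r in zip(er, rr)]
            for er, rr in zip(cca, cra)]


# Hypothetical responses: a thick crack in row 0, a thin crack pixel at (1, 2).
cca = [[0.9, 0.8, 0.0],
       [0.0, 0.0, 0.7]]
# The region branch responds only around the thick crack.
cra = [[0.9, 0.9, 0.1],
       [0.1, 0.1, 0.1]]
fused = fuse(cca, cra)
# The thin crack at (1, 2) is removed although CCA detected it.
```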
We also conduct experiments called inter-DB in Figure 14 to assess the effective performance of the
proposed network. Inter-DB denotes that the trained networks are applied in testing to a
different dataset that has not been used in training. The inter-DB setting is more challenging because the
network needs to adapt to different properties. In Figure 14, we observe that the network is efficiently
applied to the new dataset named SDNET2018 [43]. The last two columns of Figure 14 show interesting
results. In the fifth column, there are no visible cracks in the images, but there are some color changes.
The proposed network is able to capture the difference and shows very few activations in the output.
In the last column, there are color changes as well as cracks. In this case, the network can capture the
differences. These results show that the network can be efficiently applied in inter-DB as well.
Figure 14. Crack detection results on the SDNET2018 dataset.
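An inter-DB evaluation of the kind described above can be sketched as a loop that scores a fixed, already trained model on image and ground-truth pairs from a dataset not seen in training. The sketch below assumes binary masks and uses pixel-wise precision, recall, and F-measure as the score; this choice of metric, as well as the names `precision_recall_f1`, `evaluate_inter_db`, and `predict_fn`, are hypothetical and are not taken from the experiments reported here.

```python
def precision_recall_f1(pred, truth):
    """Pixel-wise precision, recall, and F-measure for binary 2-D masks."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def evaluate_inter_db(predict_fn, unseen_dataset):
    """Score a trained model on (image, truth) pairs from an unseen dataset.

    predict_fn stands in for the trained network (hypothetical placeholder).
    """
    return [precision_recall_f1(predict_fn(img), gt) for img, gt in unseen_dataset]


# Toy check with an identity "model" on one hypothetical 2x2 sample.
sample_image = [[1, 1], [0, 0]]
sample_truth = [[1, 0], [1, 0]]
scores = evaluate_inter_db(lambda img: img, [(sample_image, sample_truth)])
```

In practice `predict_fn` would wrap the network trained on one dataset, and `unseen_dataset` would iterate over annotated images from another.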
6. Conclusions
We proposed an autonomous crack detection algorithm using a two-branched convolutional
neural network. Cracks are difficult to distinguish from other noise in images obtained from
image sensors. To efficiently recognize cracks, we designed a crack-component-aware (CCA) network
with a U-shaped structure to learn the edge features of the cracks. The crack-region-aware (CRA) network
emphasizes critical cracks and suppresses trivial noise. Through the combination of the two sub-networks,
we could finally extract only the significant cracks. The proposed method incurs some increase in
computational complexity due to the deep learning architectures, as compared to block-based crack
detection techniques. However, the proposed algorithm provided improved detection accuracy and
reliable detection performance as compared to the previous algorithms on different crack image datasets.
Author Contributions: Conceptualization, H.-S.K. and J.-W.K.; methodology and software, N.K. and J.L.;
validation, J.L. and E.-M.R.; formal analysis and investigation, N.K., J.L. and J.-W.K.; resources, H.-S.K.; data
curation, N.K., J.L. and E.-M.R.; writing—original draft preparation, J.L.; writing—review and editing, J.-W.K.;
funding acquisition, J.-W.K. and H.-S.K.
Funding: This work was supported by the National Research Foundation of Korea (NRF) grant funded by the
Korea government (MSIT) (No. NRF-2019R1C1C1010249).
Acknowledgments: This work was supported by the National Research Foundation of Korea (NRF) grant funded
by the Korea government (MSIT) (No. NRF-2019R1C1C1010249).
Conflicts of Interest: The authors declare no conflict of interest.
References
1. Prasanna, P.; Dana, K.J.; Gucunski, N.; Basily, B.B.; Hung, M.L.; Lim, R.S.; Parvardeh, H. Automated crack
detection on concrete bridges. IEEE Trans. Autom. Sci. Eng. 2016, 13, 591–599. [CrossRef]
2. Chen, F.-C.; Jahanshahi, M.R. Nb-cnn: Deep learning-based crack detection using convolutional neural
network and naive bayes data fusion. IEEE Trans. Ind. Electron. 2018, 65, 4392–4400. [CrossRef]
3. Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network.
In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA,
25–28 September 2016; pp. 3708–3712.
4. Makantasis, K.; Protopapadakis, E.; Doulamis, A.; Doulamis, N.; Loupos, C. Deep convolutional neural
networks for efficient vision based tunnel inspection. In Proceedings of the 2015 IEEE International Conference
on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September
2015.
5. Korea Concrete Institute. Korea Structural Concrete Design Code 2012; English & Korean; Korea Concrete
Institute: Seoul, Korea, 2012.
6. Noh, Y.; Koo, D.; Kang, Y.-M.; Park, D.; Lee, D. Automatic crack detection on concrete images using
segmentation via fuzzy c-means clustering. In Proceedings of the 2017 International Conference on Applied
System Innovation (ICASI), Sapporo, Japan, 13–17 May 2017.
7. Youm, M.; Yun, H.; Jung, T.; Lee, G. High-speed crack detection of structure by computer vision. In Proceedings
of the KSCE 2015 Convention 2015 Civil Expo and Conference, Gunsan, Korea, 28–30 October 2015.
8. Song, Q.; Lin, G.; Ma, J.; Zhang, H. An edge-detection method based on adaptive canny algorithm and
iterative segmentation threshold. In Proceedings of the 2016 2nd International Conference on Control Science
and Systems Engineering (ICCSSE), Singapore, 27–29 July 2016.
9. Xie, S.; Tu, Z. Holistically-nested edge detection. In Proceedings of the 2015 IEEE International Conference
on Computer Vision, Santiago, Chile, 7–13 December 2015.
10. Liu, N.; Han, J.; Yang, M.-H. Picanet: Learning pixel-wise contextual attention for saliency detection.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT,
USA, 18–23 June 2018; pp. 3089–3098.
11. Chen, H.; Qi, X.; Yu, L.; Heng, P.-A. Dcan: Deep contour-aware networks for accurate gland segmentation.
In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA,
26 June–1 July 2016; pp. 2487–2496.
12. Kim, H.; Ryu, E.; Lee, Y.; Kang, J.-W.; Lee, J. Performance evaluation of fire damaged reinforced concrete
beams using machine learning. In Proceedings of the 17th International Conference on Computing in Civil
and Building Engineering, Tampere, Finland, 5–7 June 2018.
13. Song, Q.; Wu, Y.; Xin, X.; Yang, L.; Yang, M.; Chen, H.; Liu, C.; Hu, M.; Chai, X.; Li, J. Real-time tunnel crack
analysis system via deep learning. IEEE Access 2019, 7, 64186–64197. [CrossRef]
14. Cha, Y.-J.; Choi, W.; Büyüköztürk, O. Deep learning-based crack damage detection using convolutional
neural networks. Comput. Aided Civ. Infrastruct. Eng. 2017, 32, 361–378. [CrossRef]
15. Kim, B.; Cho, S. Automated vision-based detection of cracks on concrete surfaces using a deep learning
technique. Sensors 2018, 18, 3452. [CrossRef] [PubMed]
16. Yokoyama, S.; Matsumoto, T. Development of an automatic detector of cracks in concrete using machine
learning. Procedia Eng. 2017, 171, 1250–1255. [CrossRef]
17. Silva, W.; Diogo, S. Concrete cracks detection based on deep learning image classification. Multidiscip. Digit.
Publ. Inst. Proc. 2018, 2, 489. [CrossRef]
18. Basu, M. Gaussian-based edge-detection methods—A survey. IEEE Trans. Syst. Man Cybern. 2002, 32,
252–260. [CrossRef]
19. Khan, A.; Sung, J.; Kang, J.-W. Multi-channel Fusion Convolutional Neural Network to Classify Syntactic
Anomaly from Language-Related ERP Components. Inf. Fusion 2019, 52, 53–61. [CrossRef]
20. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation.
In Proceedings of the MICCAI 2015, Munich, Germany, 5–9 October 2015.
21. Lim, R.S.; La, H.M.; Sheng, W. A robotic crack inspection and mapping system for bridge deck maintenance.
IEEE Trans. Autom. Sci. Eng. 2014, 11, 367–378. [CrossRef]
22. Cho, H.; Yoon, H.-J.; Jung, J.-Y. Image-based crack detection using crack width transform (cwt) algorithm.
IEEE Access 2018, 6, 60100–60114. [CrossRef]
23. Liang, S.; Jianchun, X.; Xun, Z. An algorithm for concrete crack extraction and identification based on
machine vision. IEEE Access 2018, 6, 28993–29002. [CrossRef]
24. Li, L.; Wang, Q.; Zhang, G.; Shi, L.; Dong, J.; Jia, P. A method of detecting the cracks of concrete undergo
high-temperature. Constr. Build. Mater. 2018, 162, 345–358. [CrossRef]
25. Zalama, E.; Gómez-García-Bermejo, J.; Medina, R.; Llamas, J. Road crack detection using visual features
extracted by gabor filters. Comput. Aided Civ. Infrastruct. Eng. 2014, 29, 342–358. [CrossRef]
26. Li, Y.; Li, H.; Wang, H. Pixel-Wise Crack Detection Using Deep Local Pattern Predictor for Robot Application.
Sensors 2018, 18, 3042. [CrossRef]
27. Chaudhury, S.; Nakano, G.; Takada, J.; Iketani, A. Spatial-temporal motion field analysis for pixelwise crack
detection on concrete surfaces. In Proceedings of the 2017 IEEE Winter Conference on Applications of
Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 336–344.
28. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate
shift. In Proceedings of the ICML 2015, Lille, France, 6–11 July 2015.
29. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent
neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
30. Li, R.; Liu, W.; Yang, L.; Sun, S.; Hu, W.; Zhang, F.; Li, W. Deepunet: A deep fully convolutional network
for pixel-level sea-land segmentation. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 3954–3962.
[CrossRef]
31. Lee, J.; Kang, M.; Kang, J.-W. Ensemble of Binary Tree Structured Deep Convolutional Network for Image
Classification. In Proceedings of the Asia-Pacific Signal and Information Processing Association (APSIPA),
Kuala Lumpur, Malaysia, 12–15 December 2017.
32. Mun, Y.J.; Kang, J.-W. Ensemble of Random Binary Output Encoding for Adversarial Robustness. IEEE Access
2019, 7, 124632–124640. [CrossRef]
33. Islam, M.; Sohaib, M.; Kim, J.; Kim, J. Crack Classification of a Pressure Vessel Using Feature Selection and
Deep Learning Methods. Sensors 2018, 18, 4379. [CrossRef]
34. Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic road crack detection using random structured forests.
IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445. [CrossRef]
35. Amhaz, R.; Chambon, S.; Idier, J.; Baltazart, V. Automatic crack detection on two-dimensional pavement
images: An algorithm based on minimal path selection. IEEE Trans. Intell. Transp. Syst. 2016, 17, 2718–2729.
[CrossRef]
36. Dorafshan, S.; Thomas, R.J.; Maguire, M. Fatigue crack detection using unmanned aerial systems in fracture
critical inspection of steel bridges. J. Bridge Eng. 2018, 23, 04018078. [CrossRef]
37. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application
to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth
IEEE International Conference on Computer Vision (ICCV), Vancouver, BC, Canada, 7–14 July 2001.
38. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.;
Bernstein, M.; et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
[CrossRef]
39. Everingham, M.; Eslami, S.A.; van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object
classes challenge: A retrospective. Int. J. Comput. Vis. 2015, 111, 98–136. [CrossRef]
40. Staal, J.; Abràmoff, M.D.; Niemeijer, M.; Viergever, M.A.; van Ginneken, B. Ridge-based vessel segmentation
in color images of the retina. IEEE Trans. Med. Imaging 2004, 23, 501–509. [CrossRef] [PubMed]
41. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural
Networks; NIPS: Lake Tahoe, NV, USA, 2012.
42. Springenberg, J.T.; Dosovitskiy, A.; Brox, T.; Riedmiller, M. Striving for simplicity: The all convolutional net.
arXiv 2014, arXiv:1412.6806.
43. Dorafshan, S.; Thomas, R.; Maguire, M. SDNET2018: An annotated image dataset for non-contact concrete
crack detection using deep convolutional neural networks. Data Brief 2018, 1664–1668. [CrossRef]
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).