A Modified Image Processing Method For Deblurring Based On GAN Networks


2019 5th International Conference on Big Data and Information Analytics

Xiaonan Zhang (1), Tel.: +86 18711132144, E-mail: zhangxiaonan12@nudt.edu.cn
Yang Lv (1), Tel.: +86 13050769808, E-mail: nudtlvyang@163.com
Yunjie Li (1), Tel.: +86 18941930931, E-mail: leeyunj@yeah.net
Yu Liu (1), Tel.: +86 13387819333, E-mail: zhoujq8898@foxmail.com
Pengcheng Luo (2), Tel.: +86 13308492836, E-mail: 379134575@qq.com

(1) Air Force Harbin Flight Academy, 111000, Liaoyang, China.
(2) AllSim Technology Inc, No. 658 Lugu Avenue, Yuelu District, 410013, Changsha, China.

Abstract—Removing image blur caused by camera shake is a challenging problem in computer vision. Existing image deblurring methods do not apply to images degraded by partial motion blur, and existing partial blur detection approaches use only low-level blur features to measure how blurry an image is, so the blur regions they extract are often misclassified. To address the image deblurring problem, we propose a method based on a Generative Adversarial Network (GAN) architecture that uses dual path connections. Compared with traditional image deblurring algorithms, this model avoids the dependence on a priori knowledge of the blurred image. Experimental results show that the proposed method significantly outperforms other state-of-the-art image deblurring algorithms.

Keywords- image processing; image deblurring; artificial intelligence; dual path connection; convolutional neural network

I. INTRODUCTION

As virtual technology and the mobile Internet advance, images play an important role in how people acquire and pass on information. However, many unforeseeable factors in real life can cause image blur, such as atmospheric turbulence, diffraction and aberration in the camera's optical system, and relative motion between the imaging equipment and the object. Degraded images often fail to meet users' needs and can have a non-negligible impact on subsequent image processing, including image segmentation, image feature extraction, and object tracking.

In recent years, image deblurring has gradually become an important research branch of digital image processing. Most researchers process degraded images under certain assumptions built on linear blur. Because the assumptions differ, the performance of each deblurring method differs. In practice, the blur is not necessarily linear, nor does it necessarily conform to an algorithm's assumptions, which leads to unsatisfactory deblurring results in practical applications.

Blur is typically modeled as the convolution of a point spread function (PSF) with a hypothetical sharp input image: the sharp image is convolved with the blur kernel, and noise is added to obtain the blurred image. Image deblurring is the inverse process, also known as image deconvolution. In real life, the blur kernel is often unknown, and a non-uniform kernel is quite hard to estimate, which poses many obstacles and challenges for researchers. Hence, there is an urgent need to solve the image deblurring problem.

Fortunately, thanks to advances in computer hardware, the emergence of big data, and the prevalence of machine learning algorithms, image deblurring has made breakthrough progress in both theory and practice. Deblurring algorithms proposed by many scholars have been applied in daily life, including medical imaging [1], traffic monitoring, aerospace [2], target tracking [3], object recognition [4] and other fields. As science and technology advance, image information will be applied in a wider range of areas, including the national economy and defense security. In the future, both the subjective visual quality and the objective evaluation criteria demanded of deblurring will become more stringent.

To make use of this computing performance, this paper proposes a generative adversarial network (GAN) based on a dual path network architecture to address this difficult problem. In our experiments it achieved better results than the most advanced blind deblurring algorithms, including a traditional mathematical method and two deep-learning methods.

The rest of this paper is organized as follows. Section II discusses related work on image restoration. Section III introduces a new method for restoring blurred images. Section IV gives the experimental results with a brief analysis. Finally, Section V draws the conclusion.
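The degradation model described in the Introduction (a sharp image convolved with a blur kernel, plus additive noise) can be illustrated with a small sketch. This is only a toy illustration in plain Python: the function names `convolve2d_same` and `degrade`, the zero padding at the borders, and the Gaussian noise model are assumptions made here, not details taken from the paper.

```python
import random


def convolve2d_same(image, kernel):
    """True 2-D convolution with zero padding; output has the same size as `image`."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2  # kernel centre (odd-sized kernel assumed)
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + ph - i, c + pw - j  # flipped indexing gives convolution
                    if 0 <= rr < h and 0 <= cc < w:
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def degrade(sharp, kernel, noise_sigma=0.0, seed=0):
    """Blurred image = sharp image convolved with the blur kernel, plus additive noise."""
    rng = random.Random(seed)
    blurred = convolve2d_same(sharp, kernel)
    if noise_sigma == 0.0:
        return blurred
    # Gaussian noise is an assumption of this sketch.
    return [[v + rng.gauss(0.0, noise_sigma) for v in row] for row in blurred]


# A single bright pixel smeared by a 3-tap horizontal motion kernel.
sharp = [[0.0] * 5 for _ in range(3)]
sharp[1][2] = 1.0
motion_kernel = [[1 / 3, 1 / 3, 1 / 3]]
blurred = degrade(sharp, motion_kernel)
```

In this toy case the bright pixel is spread evenly over three neighbouring pixels of the same row, which is the kind of smearing that a blind deblurring method must undo without knowing `motion_kernel`.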

978-1-7281-3933-3/19/$31.00 ©2019 IEEE



II. RELATED WORK

A. Image Deblurring
Image deblurring, or deconvolution, is an important problem in computer vision. Depending on whether the blur kernel is known, there are two cases: non-blind deconvolution and blind deconvolution. In most cases the type of blur is unknown. Blind deconvolution generally consists of two steps: the first estimates the PSF, and the second recovers the original image by non-blind deconvolution with the estimated PSF, usually by means of regularization.

Prior information about the original image and the blur kernel is the key to solving the ill-posed image deconvolution problem. As the main component of Bayesian methods, priors make the estimate more accurate and speed up the iteration. In practice, each prior distribution is converted into a corresponding regularizer according to its statistical properties, and these regularizers act as constraints in the regularized optimization problem. The total variation (TV) norm is the best-known and most commonly used image regularizer [6]; minimizing total variation preserves the edges and details of the image. Hyper-Laplacian priors [5], wavelet-based analysis methods [7], and even compound regularizers [8] are also feasible. Regularizers for the blur kernel, in turn, depend on the specific task and tend to favor sparse and non-negative kernels.

B. Generative Adversarial Networks
The Generative Adversarial Network (GAN) is a deep learning model and one of the most promising approaches of recent years to unsupervised learning of complex distributions. It is usually implemented as two neural networks competing in a zero-sum game framework [21]. In 2014, Ian J. Goodfellow et al. proposed this framework for estimating generative models through an adversarial process. In the framework, two models are trained simultaneously: the generative model G (generator), which captures the data distribution, and the discriminative model D (discriminator), which estimates the probability that a sample came from the training data [9]. The training objective of G is to maximize the probability that D makes a mistake. This framework corresponds to a two-player minimax game. It can be proved that, in the space of arbitrary functions G and D, a unique solution exists in which G reproduces the distribution of the training data and D equals 0.5 everywhere. At present, the most common application of GANs is image generation, in tasks such as super-resolution and semantic segmentation. The objective is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_r}[\log(D(x))] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x}))] \tag{1}$$

where $P_r$ is the data distribution and $P_g$ is the model distribution, defined by $\tilde{x} = G(z)$, $z \sim P(z)$; the input $z$ is a sample from a noise distribution. The best-known property of GANs is that they can produce samples of excellent perceptual quality. However, training the basic version suffers from problems such as vanishing gradients and mode collapse.

C. Conditional Generative Adversarial Networks
The conditional generative adversarial network (CGAN) is a further development of the basic GAN [10]. Here $y$ is added as conditional information to both the generator and the discriminator, and $y$ can represent any information, such as class labels or other data. As shown in Figure 1, CGAN works by feeding the extra information $y$ to the generator and the discriminator as part of their input layers. In the generator, the conditional information $y$ is combined with the prior input noise $p(z)$ to form a joint hidden representation. The adversarial training framework allows considerable flexibility in how this hidden representation is composed. As with the GAN, the objective function of the conditional GAN is:

$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_r}[\log(D(x \mid y))] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x} \mid y))] \tag{2}$$

Figure 1. The conditional adversarial net

III. PROPOSED METHOD

Our goal is to take only a blurred image $I_B$, with no information about the blur kernel, as input and recover a sharp image $I_S$. Unlike DeblurGAN [11], which uses a residual neural network as its model framework, we want the model to create new feature information while avoiding redundancy. For this purpose, we propose a generative network that uses the dual path architecture [19].
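As a numerical illustration of the GAN value function of Section II-B, the sketch below estimates V(D, G) from discriminator outputs, written in the standard form of Goodfellow et al. [9] with the two expectation terms added. The function name `gan_value` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G): mean log D(x) plus mean log(1 - D(G(z)))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# At the equilibrium described in the text, D outputs 0.5 for every sample.
equilibrium = gan_value([0.5] * 4, [0.5] * 4)

# A discriminator that separates real from generated samples scores higher,
# which is what the inner maximization over D rewards.
confident = gan_value([0.9] * 4, [0.1] * 4)
```

At equilibrium the value equals -log 4, and any discriminator that still tells real from generated samples apart pushes the value above that level, so the generator is trained to bring it back down.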


A. Architecture

Our architecture comprises a discriminator and a generator built on the dual path network. The generator's job is to reuse features across multiple receptive-field scales and to fool the discriminator network by generating a new synthetic image that appears to come from the real target distribution. Therefore, given a blurred image, we can generate deblurred images that are visually appealing and statistically consistent. The task of the discriminator is to analyze different patches of each image and correctly identify which distribution each input image comes from. The following subsections give the specific details of our generator and discriminator models.

1) The Generator
The generator architecture is shown in Figure 3. We follow the approach of DeblurGAN proposed by Kupyn et al. [11]. The generator contains four simple 3×3 convolutional layers, a 1×1 convolutional block, eleven DPN stages and two transposed convolution blocks. The Dual Path Network (DPN) combines the advantages of ResNet and DenseNet. It can be understood as introducing the core idea of DenseNet on top of ResNet to make full use of the features in the model. It not only uses high-level information effectively to discover new features, but also reduces the model's redundancy. DPN is a network topology with high performance and low resource usage that effectively integrates the advantages of existing networks. Its performance has been verified on the three tasks of image recognition, image detection and image segmentation, with each indicator significantly improved. The mathematical form is as follows:

$$
\begin{aligned}
x^k &\triangleq \sum_{t=1}^{k-1} f_t^k(h^t) \\
y^k &\triangleq \sum_{t=1}^{k-1} v_t(h^t) = y^{k-1} + \phi^{k-1}(y^{k-1}) \\
r^k &\triangleq x^k + y^k \\
h^k &= g^k(r^k)
\end{aligned} \tag{3}
$$

where $x^k$ represents the DenseNet path and $y^k$ represents the ResNet path; $h^t$ represents the hidden layer state at step $t$; $k$ represents the current step; $x^t$ represents the input at step $t$; $f_t^k(\cdot)$ represents feature extraction; and $g^k$ represents the transformation applied to the extracted features before output.

As shown in Figure 3, each DPN Stage is one level of convolution module, consisting of two 1×1 convolutions, one 3×3 convolution, three BatchNorm layers and three ReLU layers. When an input x enters a DPN Stage, it is split along the channel dimension into two parts, data_o1 and data_o2, and two branches operate on them simultaneously. The first branch is similar to the ResNet shortcut and keeps data_o1 and data_o2 unchanged. The second branch passes through a 1×1 convolution, a 3×3 convolution and a 1×1 convolution, and its output is split along the channel dimension again into parts C1 and C2. C1 and data_o1 are then added element-wise (⊕ symbol in the figure) to give the new data_o1, and C2 and data_o2 are concatenated ([ ] symbol in the figure) to give the new data_o2. The results are passed as the new input to the next DPN Stage. It is worth noting that we add a dropout layer with a probability of 0.5 after the first convolution layer in each DPN Stage. In addition, DPN trains faster and yields richer features, which gives it better deblurring performance.

Figure 2. Illustration of dual path connection

2) The Discriminator
In our GAN architecture, the discriminator is the main module that drives the generator to produce statistically consistent restored images. Moreover, we need a classifier that classifies accurately but whose depth does not make it overly discriminating on easier tasks. Therefore, we define a discriminator like that of DeblurGAN, trained with the gradient penalty method [13]. The critic network has the same architecture as PatchGAN [10,14], which is considered to be the most effective structure in image generation.

Figure 3. Our Convolutional Neural Network Architecture
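The data flow of a single DPN Stage (split the output of the convolution branch into C1 and C2, add C1 to data_o1, concatenate C2 onto data_o2) can be sketched without any deep learning library. In this toy version each channel is reduced to a single number, and `make_toy_transform` is a hypothetical stand-in for the 1×1, 3×3, 1×1 convolution branch; neither is taken from the paper's implementation.

```python
def dual_path_stage(data_o1, data_o2, transform):
    """One dual-path update on lists of per-channel values."""
    merged = transform(data_o1 + data_o2)          # stand-in for the convolution branch
    split = len(data_o1)
    c1, c2 = merged[:split], merged[split:]        # channel split into C1 and C2
    new_o1 = [a + b for a, b in zip(data_o1, c1)]  # residual path: element-wise addition
    new_o2 = data_o2 + c2                          # dense path: channel concatenation
    return new_o1, new_o2


def make_toy_transform(res_channels, growth):
    """Hypothetical transform: every output channel is the mean of the input channels."""
    def transform(channels):
        mean = sum(channels) / len(channels)
        return [mean] * (res_channels + growth)
    return transform


o1, o2 = [1.0, 2.0], [3.0]
transform = make_toy_transform(res_channels=2, growth=1)
for _ in range(2):  # two consecutive stages
    o1, o2 = dual_path_stage(o1, o2, transform)
```

The residual path keeps a fixed width (two channels here), while the dense path grows by one channel per stage. This is how the dual path design reuses existing features and still adds new ones.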


B. Loss Functions

1) Content Loss
Generally, the L1 or L2 loss between the restored image and the ground truth is used as the primary objective function for image deblurring. However, using the L1 or L2 loss alone causes problems. Because the L1 loss is sparse and its gradient is not smooth at zero, using only the L1 loss in a deep neural network makes the image too smooth and lacking in detail, while using the L2 loss alone often fails to yield a truly sparse model. Therefore, we add both an L1 loss and a perceptual loss to the net loss $\mathcal{L}_{net}$ given in Eq. (6), leveraging them simultaneously to remove these limitations.

The perceptual loss is a simple L2 loss, but it is computed on the difference between the CNN feature maps of the generated image and the target image. It is defined as follows:

$$\mathcal{L}_{percep}(VGG/i,j) = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I_S)_{x,y} - \phi_{i,j}(G_{\theta_G}(I_B))_{x,y} \right)^2 \tag{4}$$

where $W_{i,j}$ and $H_{i,j}$ are the width and height of the $(i,j)$-th ReLU layer of the VGG-19 network [15], $\phi_{i,j}$ is the feature map obtained by the $j$-th convolution (after activation) before the $i$-th max-pooling layer within the VGG-19 network pretrained on ImageNet, $I_B$ is a blurred image, and $I_S$ is a sharp latent image.

2) Conditional Adversarial Loss
In our GAN architecture, two image pairs are sent to the discriminator: the blurred input image paired with the corresponding generator output, and the blurred input image paired with the ground truth image. If the conditional distribution of the input image matches that of the latent image modeled by the generator, the generated image will be highly consistent with its input. This matches our basic need: G keeps its output dependent on the blurred input, so it adapts to various amounts and kinds of shake blur, while being kept from drifting too far from the input in the process of fooling the discriminator. The conditional GAN can therefore be regarded as a "relevance regularizer" in our network. Specifically, the conditional loss to be minimized is given by the following formula:

$$\mathcal{L}_{GAN} = -\mathbb{E}_{I \in I_B}\left[ \log D_{\theta_D}(G_{\theta_G}(I_B) \mid I_B) \right] \tag{5}$$

Hence, the comprehensive loss function of our model is

$$\mathcal{L}_{net} = \mathcal{L}_{GAN} + K_1 \mathcal{L}_{percep} + K_2 \mathcal{L}_{L1} \tag{6}$$

where $K_1$ and $K_2$ are hyperparameters, set to 130 and 155 respectively in our experiments.

IV. EXPERIMENTS

In this section, a series of experimental results shows that our DPN-based model performs better than other advanced algorithms.

A. Dataset
Images from ImageNet, resized to 256 × 256 × 3, were used to form the dataset. Specifically, a subset of 1430 ImageNet images was randomly selected. For each resized image, a motion trajectory was generated by modeling a particle's motion affected by inertial, impulsive and Gaussian perturbations. The resulting PSFs were used as blur kernels. Each image was convolved with its corresponding group of four blur kernels, producing a dataset of 5720 blurred images with corresponding ground truth images. Gaussian and Poisson noise were subsequently added to them. The train-test-validation split was 80-10-10, giving 4576 images in the training set and 572 images in each of the test and validation sets.

B. Training Details
On the software side, our experiments run on the PyTorch deep learning framework. On the hardware side, all operations run on a desktop computer with an Nvidia GTX 1080 Ti GPU, 16 GB of RAM and an i7 processor. We use the adaptive learning method, and each training run starts from the weights obtained in the previous run instead of restarting from scratch. The initial learning rate was set to 0.0002 for both the generator and the discriminator. After the first 120 epochs, the learning rate is linearly reduced to zero over the next cycle. The Adam optimizer, a stochastic gradient descent (SGD) method, is used to minimize the loss function. To fit within GPU memory, we searched for the best batch size for network processing. A batch size of 2 gave the best experimental results, so it was used for every training run.

C. Results and Analysis
Our model is compared not only with the advanced traditional mathematical deblurring algorithm of Sroubek et al. [16], but also with the deep-learning algorithms proposed by Sun [17] and Nah [18]. We choose PSNR and SSIM as evaluation indicators; larger values of both indicate higher-quality generated images. The final experimental results are summarized in Table I.

The results indicate that our method outperforms the other methods on both indicators, reaching a PSNR of 27.98 and an SSIM of 0.847. The visual results of the different methods are compared in Figure 4. As the figure shows, our method produces better visual quality than the other advanced methods.

TABLE I. SUMMARY OF OUR RESULTS ON TWO EVALUATION INDICATORS OF ESTIMATED IMAGES


        Sroubek et al.    Sun et al.    Nah et al.    Ours
PSNR    18.36             24.82         27.57         27.98
SSIM    0.553             0.764         0.832         0.847

V. CONCLUSION AND FUTURE WORK

Our method exploits the properties of the conditional adversarial network. In contrast to traditional mathematical methods, it is a kernel-free, learning-based approach to blind motion deblurring, and it introduces a novel conditional GAN framework based on dual path connections. The experimental results show that our model performs very well in image deblurring compared with the state of the art. However, for a few images the deblurring process produces an output with a pure white background, for reasons that are not yet known. In future work, we will modify the model in the next iteration of the project and improve it by exploring and developing more efficient loss functions.
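The way the three terms of the comprehensive loss in Section III-B are weighted can be made concrete with a small numerical sketch. The weights 130 and 155 come from the text; everything else is an assumption for illustration: the helper names, the use of the mean absolute difference for the L1 term, and the tiny hand-written "feature maps" that stand in for VGG-19 activations.

```python
import math

K1, K2 = 130.0, 155.0  # weights of the perceptual and L1 terms reported in the text


def perceptual_loss(feat_sharp, feat_restored):
    """Mean squared difference between two feature maps (stand-ins for VGG-19 features)."""
    h, w = len(feat_sharp), len(feat_sharp[0])
    total = sum((feat_sharp[y][x] - feat_restored[y][x]) ** 2
                for y in range(h) for x in range(w))
    return total / (w * h)


def l1_loss(sharp, restored):
    """Mean absolute pixel difference (normalization is an assumption of this sketch)."""
    diffs = [abs(a - b) for row_s, row_r in zip(sharp, restored)
             for a, b in zip(row_s, row_r)]
    return sum(diffs) / len(diffs)


def adversarial_loss(d_scores):
    """Negative mean log of the discriminator's scores on generated images."""
    return -sum(math.log(p) for p in d_scores) / len(d_scores)


def net_loss(d_scores, feat_sharp, feat_restored, sharp, restored):
    """Adversarial term plus weighted perceptual and L1 terms."""
    return (adversarial_loss(d_scores)
            + K1 * perceptual_loss(feat_sharp, feat_restored)
            + K2 * l1_loss(sharp, restored))


total = net_loss(
    d_scores=[0.5],
    feat_sharp=[[1.0, 2.0], [3.0, 4.0]],
    feat_restored=[[1.0, 2.0], [3.0, 2.0]],
    sharp=[[0.5, 0.5]],
    restored=[[0.5, 0.0]],
)
```

With these toy inputs the perceptual term is 1.0 and the L1 term is 0.25, so the weighted content terms (130 and 38.75) are far larger than the adversarial term (about 0.69).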

Figure 4. Comparison of deblurred images by our model and other algorithms. From left to right: blurred photo, result of Sroubek et al. [16], result of Nah et al. [18], result of our algorithm.
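PSNR, one of the two evaluation indicators used in the comparison, follows directly from the mean squared error between a restored image and its ground truth. The sketch below uses the common definition with a peak value of 255, which assumes 8-bit images; the paper does not state the peak value or its implementation, and the function name `psnr` is chosen here for illustration.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images."""
    sq_errors = [(a - b) ** 2 for row_r, row_e in zip(reference, estimate)
                 for a, b in zip(row_r, row_e)]
    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


ground_truth = [[0, 0], [0, 0]]
slightly_off = [[0, 0], [0, 10]]
far_off = [[0, 0], [0, 100]]

score_close = psnr(ground_truth, slightly_off)
score_far = psnr(ground_truth, far_off)
```

A restoration closer to the ground truth has a smaller mean squared error and therefore a higher PSNR, which is why larger values in Table I indicate better results.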

REFERENCES

[1] Michailovich, O. V., and D. Adam. "A novel approach to the 2-D blind deconvolution problem in medical ultrasound." IEEE Transactions on Medical Imaging 24.1(2005):86.
[2] Krist, J. "Simulation of HST PSFs using Tiny Tim." 77(1995):349.
[3] Wu, Yi, et al. "Blurred target tracking by Blur-driven Tracker." IEEE International Conference on Computer Vision IEEE, 2011:1100-1107.
[4] Li, Haoxiang, et al. "A convolutional neural network cascade for face detection." Computer Vision and Pattern Recognition IEEE, 2015:5325-5334.
[5] D. Krishnan, R. Fergus, Fast image deconvolution using hyper-Laplacian priors. 2009. In: NIPS, 22 (2009) 1-9.
[6] Chan T, Esedoglu S, Park F, et al. 2006. Total Variation Image Restoration: Overview and Recent Developments[M]// Handbook of Mathematical Models in Computer Vision. 2006:17-31.
[7] Elad M, Milanfar P, Rubinstein R. Analysis versus synthesis in signal priors[C]// Signal Processing Conference, 2006, European. IEEE, 2015:947.
[8] Bioucas-Dias J M, Figueiredo M A T. An iterative algorithm for linear inverse problems with compound regularizers[J]. 2009:685-688.
[9] Goodfellow, Ian J, et al. "Generative Adversarial Networks." Advances in Neural Information Processing Systems 3(2014):2672-2680.
[10] Isola, Phillip, et al. "Image-to-Image Translation with Conditional Adversarial Networks." (2016):5967-5976.
[11] Kupyn, Orest, et al. "DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks." (2017).
[12] Arjovsky, Martin, S. Chintala, and L. Bottou. "Wasserstein GAN." (2017).
[13] Gulrajani, Ishaan, et al. "Improved Training of Wasserstein GANs." (2017).
[14] Li, Chuan, and M. Wand. "Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks." (2016):702-716.
[15] Deng, Jia, et al. "ImageNet: A large-scale hierarchical image database." Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on IEEE, 2009:248-255.
[16] Sroubek F, Milanfar P. Robust multichannel blind deconvolution via fast alternating minimization. [J]. IEEE Transactions on Image Processing, 2012, 21(4):1687-1700.
[17] Sun, Jian, et al. "Learning a convolutional neural network for non-uniform motion blur removal." Computer Vision and Pattern Recognition IEEE, 2015:769-777.
[18] Nah, Seungjun, T. H. Kim, and K. M. Lee. "Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring." (2016):257-265.
[19] Chen, Yunpeng, et al. "Dual Path Networks." (2017).
[20] Zhu, Wentao, et al. "DeepLung: Deep 3D Dual Path Nets for Automated Pulmonary Nodule Detection and Classification." (2018).
[21] Generative adversarial network, https://en.wikipedia.org/wiki?curid=50073184
[22] Ramakrishnan, Sainandan, et al. "Deep generative filter for motion deblurring." arXiv preprint arXiv:1709.03481 (2017).
[23] Wang, Kunfeng, et al. "Generative Adversarial Networks for Parallel Vision." Chinese Automation Congress 2018.
[24] Zhu, Wentao, et al. "Deeplung: Deep 3d dual path nets for automated pulmonary nodule detection and classification." arXiv preprint arXiv:1801.09555 (2018)
