A High-Resolution Image Dehazing GAN Model in Icing Meteorological Environment

Xinling Yang, Wenjun Zhou*, Chenglin Zuo, Yifan Wang, Bo Peng

Proc. SPIE 12783, International Conference on Images, Signals, and Computing (ICISC 2023), 1278305 (21 August 2023); doi: 10.1117/12.2691796

*Corresponding author: Wenjun Zhou, E-mail: zhouwenjun@swpu.edu.cn
ABSTRACT
In this paper, we propose a high-resolution GAN model for image dehazing in an icing meteorological environment, which strictly follows a physics-driven scattering strategy. First, sub-pixel convolution enables the model to remove image artifacts and generate high-resolution images. Second, we use PatchGAN in the discriminator to drive the generator toward haze-free images by capturing the details and local information of the image. Furthermore, to restore the texture information of the hazy image and reduce color distortion, the model is jointly trained with multiple loss functions. Experiments show that the proposed method achieves advanced performance for image dehazing in an icing weather environment.
Keywords: Generative adversarial networks, Image dehazing, Sub-pixel convolution, High resolution
1. INTRODUCTION
When an aircraft flies at high altitude, contact with water vapor in the air causes icing on its surface, and this icing greatly affects flight safety. When this scenario is tested in an icing wind tunnel, suspended water droplets with a certain liquid water content fill the test section. As light passes through these droplets, the captured images are blurred by scattering and absorption, resulting in a loss of image quality. Such images typically have distorted colors, reduced contrast, and lost edge and texture information, so the conditions of icy areas cannot be observed accurately. It is therefore of practical significance to apply image dehazing to improve the quality of monitoring images in icing wind tunnel tests.
In common dehazing algorithms, the atmospheric scattering model [1] is usually used to describe the relationship between the hazy image and the haze-free image:
$I(x) = J(x)t(x) + A(1 - t(x))$ (1)
where $J(x)$ represents the haze-free image, $I(x)$ represents the collected hazy image, $t(x)$ represents the transmission map, and $A$ represents the atmospheric illumination value. The essence of image dehazing is to recover the haze-free image from the hazy one as faithfully as possible. Given $I(x)$, solving for $J(x)$ requires estimating both $t(x)$ and $A$, which makes image dehazing a highly ill-posed problem.
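For concreteness, the following minimal sketch inverts Equation (1) once $t(x)$ and $A$ have been estimated; the function name and the lower bound on $t$ are illustrative assumptions, not details from the paper.

```python
import numpy as np

def invert_scattering_model(I, t, A, t_min=0.1):
    """Recover J(x) from Eq. (1): I(x) = J(x) t(x) + A (1 - t(x)).

    I     -- hazy image as a float array in [0, 1], shape (H, W, 3)
    t     -- estimated transmission map, broadcastable to I's shape
    A     -- estimated atmospheric illumination (scalar or per-channel)
    t_min -- illustrative lower bound on t to limit noise amplification
    """
    t = np.clip(t, t_min, 1.0)       # guard against division by near-zero t
    J = (I - A) / t + A              # algebraic inversion of Eq. (1)
    return np.clip(J, 0.0, 1.0)
```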
Many traditional algorithms for image dehazing focus on solving for the transmittance and atmospheric illumination values in the atmospheric scattering model. Representative algorithms include the dark channel prior algorithm DCP [2] and the color attenuation prior algorithm CAP [3]. In recent years, with the rise of deep learning, learning-based dehazing algorithms have also achieved good results. Among them, many methods combine neural networks with prior knowledge: they estimate the transmittance and atmospheric illumination values through network learning and then obtain the dehazed image by inverting the atmospheric scattering model, such as DehazeNet [4], the multi-scale dehazing network MSCNN [5], the densely connected pyramid dehazing network DCPDN [6], and At-DH [7].
In icing weather conditions, the haze concentration is often relatively high. For this dense-haze situation, we studied the related method At-DH from the NTIRE19 dehazing challenge. The Dense-Haze dataset [8] used in NTIRE19 is characterized by dense, uniform haze, similar to the images obtained in our environment. At-DH achieved good results in this challenge, and its dehazing effect is obvious. Therefore, inspired by At-DH, in this paper we propose a high-resolution GAN model with dense connections for image dehazing in an icing meteorological environment. Treating dehazing as an inverse problem, we estimate the transmittance and atmospheric illumination value through network learning and invert the atmospheric scattering model to perform dehazing, strictly following the physics-driven scattering model to achieve a better dehazing effect. Moreover, the generative adversarial network (GAN) has great potential for real image restoration, and we combine this strategy with image dehazing to generate haze-free images that are closer to the real ones. The main contributions of this paper are as follows:
1. A high-resolution GAN model for image dehazing is proposed to obtain effectively dehazed images in an icing meteorological environment.

2. The proposed model eliminates, to a certain extent, the artifacts caused by traditional deconvolution and helps to obtain high-resolution dehazed images.
2. RELATED WORK
Image dehazing algorithms are generally divided into prior-knowledge-based and learning-based methods. Prior-knowledge-based algorithms can be further divided into physical-model and non-physical-model algorithms. Physical-model algorithms build on the atmospheric scattering model: the parameters $t(x)$ and $A$ are obtained through prior knowledge, and the haze-free image is then recovered via Equation (1); representative examples are DCP [2] and CAP [3]. Non-physical-model algorithms use image enhancement to dehaze, such as the Retinex algorithm [9], histogram equalization [10], and wavelet and homomorphic filtering algorithms. Likewise, learning-based dehazing algorithms can be divided into parameter-estimation methods and direct-restoration methods. Parameter-estimation methods estimate $t(x)$ and $A$ through network learning and then dehaze; the parameters estimated by deep learning are generally more accurate than traditional ones, as in MSCNN [5], DehazeNet [4], and DCPDN [6]. Direct-restoration methods learn to output the dehazed image directly from the input hazy image, as in FD-GAN [11], GridDehazeNet [12], and FFA-Net [13].
3. METHODOLOGY
In this section, we introduce our network structure in detail. The network is a generative adversarial network (GAN) consisting of two parts: a generator and a discriminator. The generator estimates the transmission map and atmospheric illumination, from which the atmospheric scattering model is inverted to produce a haze-free image; the discriminator then distinguishes the generated haze-free image from the corresponding real haze-free image. The whole framework is shown in Fig. 1.
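A minimal sketch of this pipeline follows; the generator interface returning $(t, A)$ is an assumption based on the description above, not the paper's actual code.

```python
import torch

def dehaze(generator, I):
    """Hypothetical forward pass matching Fig. 1: the generator predicts
    the transmission map t and the atmospheric light A, and the haze-free
    estimate follows by inverting Eq. (1)."""
    t, A = generator(I)          # assumed interface: t (N,1,H,W), A (N,3,1,1)
    t = t.clamp(min=0.1)         # illustrative lower bound on transmission
    J = (I - A) / t + A          # physics-driven inversion of Eq. (1)
    return J.clamp(0.0, 1.0)
```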
3.1 Generator
The network structure of the generator is shown in Fig. 2. The encoder on the left extracts important features from the image, while the decoder on the right estimates scene information from these features and restores the image to its original size. The encoder is built on the densely connected network DCN [14], mainly from densely connected modules (Dense_Block) and transition modules (Trans_Block). The decoder has a similar structure, including the bottleneck layer (Bottleneck Block) [7], the transition layer (Transition Block), the residual layer (Residual Block), and the refinement layer (Refine Block) [6].
3.1.1 Encoder
In the encoder, the layers from the Base Block through the third Trans_Block use the pre-trained parameters of the first half of the DCN, because this structure concatenates features along the channel dimension and thereby achieves feature reuse. Using pre-trained densely connected blocks in the dehazing task helps to extract important features.
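As an illustration, such an encoder could reuse the first half of a pre-trained DenseNet as follows; the densenet121 backbone and the torchvision layer names are assumptions, not details confirmed by the paper.

```python
import torch
import torchvision

# Pre-trained DenseNet feature extractor; the exact backbone is assumed.
backbone = torchvision.models.densenet121(weights="IMAGENET1K_V1").features

encoder = torch.nn.Sequential(
    backbone.conv0, backbone.norm0, backbone.relu0, backbone.pool0,  # "Base Block"
    backbone.denseblock1, backbone.transition1,   # Dense_Block + Trans_Block (x3)
    backbone.denseblock2, backbone.transition2,
    backbone.denseblock3, backbone.transition3,   # up to the third Trans_Block
)

features = encoder(torch.randn(1, 3, 1024, 1024))  # reused pre-trained features
```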
3.1.2 Decoder
The main function of the Transition Block is to enlarge and refine features: a 1×1 convolutional layer in the Transition Block changes the number of channels, and the feature map is then enlarged through upsampling. Compared with the Dense Block in the encoder, the Bottleneck Block adds an extra batch normalization to normalize the training data, giving the network better training stability and avoiding gradient explosion. Adding a residual layer between two consecutive dense blocks helps recover more details in the image by extracting more high-frequency information.
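For illustration, a minimal Transition Block consistent with this description is sketched below; the channel sizes and the bilinear upsampling mode are assumptions.

```python
import torch
import torch.nn as nn

class TransitionBlock(nn.Module):
    """Illustrative decoder Transition Block: a 1x1 convolution changes the
    channel count, then upsampling enlarges the feature map."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=1)   # channel change
        self.up = nn.Upsample(scale_factor=2, mode="bilinear",
                              align_corners=False)             # enlargement

    def forward(self, x):
        return self.up(self.conv(x))

y = TransitionBlock(256, 128)(torch.randn(1, 256, 32, 32))  # -> (1, 128, 64, 64)
```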
The Refine Block merges and retouches image information at different scales. The residual connections superimpose identity mappings on the shallow layers so that the network does not degrade as its depth increases.
3.1.3 Sub-pixel convolution
Sub-pixel convolution [15] is applied in the field of image super-resolution; it maps low-resolution data to a high-resolution space and is an upsampling method based on pixel rearrangement. The sub-pixel convolution process is described as follows:

$I^{HR} = f^L(I^{LR}) = \mathrm{PS}(W_L * f^{L-1}(I^{LR}) + b_L)$ (2)

where $I^{HR}$ is the high-resolution image, $I^{LR}$ is the low-resolution image, $f$ is the convolution operation, $W_L$ is the weight of the convolution kernel, $b_L$ is the bias term, and $\mathrm{PS}$ is the pixel reorganization (pixel shuffle) operation.
The pixel reorganization operation takes one element from each channel of the multi-channel feature map and combines them into a square unit on the new feature map; the pixels of the original feature map thus become sub-pixels of the new one. In this paper, we replace the original upsampling layer with a sub-pixel convolution layer, which helps drive the generator to capture local image information and generate a relatively high-resolution haze-free image.
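A minimal sketch of this replacement using PyTorch's built-in pixel shuffle is shown below; the channel sizes and upscale factor are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SubPixelUpsample(nn.Module):
    """Upsampling by pixel rearrangement (Eq. (2)): a convolution expands
    the channels by r^2, then PixelShuffle reorders each group of r^2
    channel elements into an r x r spatial block."""
    def __init__(self, in_ch, out_ch, r=2):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * r * r, kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(r)   # the PS(.) operation in Eq. (2)

    def forward(self, x):
        return self.shuffle(self.conv(x))

up = SubPixelUpsample(64, 64, r=2)
y = up(torch.randn(1, 64, 128, 128))    # -> (1, 64, 256, 256)
```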
3.2 Discriminator
The discriminator introduces PatchGAN [16]. Unlike a conventional discriminator, it judges the image patch by patch, producing a matrix of real/fake scores whose average gives the final true/false output. This makes the model attend to image details during training and yields higher-resolution pictures. Since PatchGAN has few parameters and runs fast, it can be applied to pictures of any size. The discriminator consists of a series of convolutional layers, batch normalization layers, and activation layers, as shown in Fig. 3.

Figure 3. Discriminator.
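As a concrete illustration, a minimal PatchGAN-style discriminator consistent with Fig. 3 might look as follows; the kernel sizes, channel widths, and depth are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, stride):
    # convolution + batch normalization + activation, as in Fig. 3
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=stride, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.2, inplace=True),
    )

class PatchDiscriminator(nn.Module):
    """PatchGAN sketch: outputs a grid of real/fake scores, one per image
    patch; the final decision averages over the grid."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            conv_block(64, 128, 2),
            conv_block(128, 256, 2),
            conv_block(256, 512, 1),
            nn.Conv2d(512, 1, 4, stride=1, padding=1),  # per-patch score map
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

scores = PatchDiscriminator()(torch.randn(1, 3, 256, 256))  # patch score grid
```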
3.3 Loss function
The loss function regularizes the direction of parameter learning during network training. To better train the network and generate images with a good dehazing effect, we use three common losses: a reconstruction loss, a perceptual loss, and an adversarial loss.
3.3.1 Reconstruction loss
We use the reconstruction loss to measure the gap between the generated dehazed image and the real haze-free image in pixel space, which can be expressed as:
$L_{Res} = \frac{1}{N}\sum_{i=1}^{N} \|G(I_i) - J_i\|_1$ (3)
where $I_i$ represents the input hazy image, $J_i$ represents the corresponding real haze-free image, and $G(I_i)$ represents the dehazed image generated by the generator.
3.3.2 Perceptual loss
We also use a perceptual loss to measure the perceptual similarity between the dehazed image and the haze-free image in feature space. It is computed on features from a pre-trained VGG16 network as follows:
$L_p = \frac{1}{N}\sum_{i=1}^{N} \|\phi(G(I_i)) - \phi(J_i)\|_2^2$ (4)

where $\phi(\cdot)$ denotes the feature map obtained from a VGG16 network layer.
3.3.3 Adversarial loss
In generative adversarial networks, the adversarial loss is the most commonly used loss for restoring the authenticity of the image; the loss value is computed with the binary cross-entropy function:

$L_A = \frac{1}{N}\sum_{i=1}^{N} \log\left(1 - D(G(I_i))\right)$ (5)
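The sketch below combines the three terms into a joint generator objective (Eqs. (3)-(5)); the VGG16 layer cut-off and the loss weights w_p and w_a are assumptions, since the paper does not state them.

```python
import torch
import torch.nn as nn
import torchvision

# Frozen VGG16 features up to an intermediate layer (cut-off assumed).
vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

l1, mse = nn.L1Loss(), nn.MSELoss()

def generator_loss(D, dehazed, clear, w_p=0.5, w_a=0.01):
    loss_rec = l1(dehazed, clear)                         # Eq. (3), pixel space
    loss_per = mse(vgg(dehazed), vgg(clear))              # Eq. (4), feature space
    loss_adv = torch.log(1.0 - D(dehazed) + 1e-8).mean()  # Eq. (5); minimizing
    # this drives D(G(I)) toward 1, i.e. the generator fools the discriminator
    return loss_rec + w_p * loss_per + w_a * loss_adv     # weights are assumed
```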
4. EXPERIMENT
4.1 Settings
The hazy images used in this paper were taken from multiple angles in the icing wind tunnel experimental scene supported by the Key Laboratory of Icing and Anti/De-icing of CARDC; some examples are shown in Fig. 4. To simulate an aircraft passing through clouds containing supercooled water droplets, a cloud field with a given median volume diameter (MVD) of the water droplets and liquid water content (LWC) can be selected by adjusting the water pressure and air pressure of the nozzles during the icing wind tunnel test. Both MVD and LWC are important parameters determining the haze density of the cloud field. The datasets used here include hazy images with an MVD of 25 µm and LWC of 1.31 g/m³, an MVD of 22 µm and LWC of 1.19 g/m³, and an MVD of 20 µm with LWC of 1.0 g/m³ and 0.5 g/m³, respectively.
We selected 310 cropped hazy images from the icing wind tunnel as the training set and trained on a GPU. During training, input images are resized to 1024×1024. The Adam optimizer is used for both the generator and the discriminator, with the learning rate of both set to $10^{-4}$, for a total of 100 epochs. The experiments run on a P6000 GPU with 24 GB of memory.
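A minimal sketch of this setup follows; the placeholder modules stand in for the networks of Section 3, and only the stated settings (1024×1024 inputs, Adam, learning rate $10^{-4}$, 100 epochs) come from the paper.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Input preprocessing: images are resized to 1024x1024 as stated above.
preprocess = transforms.Compose([
    transforms.Resize((1024, 1024)),
    transforms.ToTensor(),
])

generator = nn.Conv2d(3, 3, 3, padding=1)       # placeholder module
discriminator = nn.Conv2d(3, 1, 3, padding=1)   # placeholder module
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
num_epochs = 100
```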
4.2 Evaluation metrics
Since there are no corresponding real haze-free images, our experiments use the widely used visible edge gradient method [17] to evaluate the results. There are two important indicators: the ratio of newly visible edges $e$ and the mean normalized visible edge gradient $r$. As the example in Fig. 5 shows, after dehazing, the overall contrast of the image is enhanced and the number of measured visible edges increases.

Let $n_0$ and $n_r$ denote the numbers of visible edges before and after dehazing, respectively; then $e = (n_r - n_0)/n_0$ measures the ability of the algorithm to restore edges that were invisible in the hazy image. The larger the value, the better the dehazing effect.
$r = \exp\left(\frac{1}{n_r}\sum_{P_i \in Y_r} \log r_i\right)$ (8)
where $Y_r$ is the set of pixels on the visible edges of the dehazed image, and $r_i$ is the ratio of the gradients of the dehazed image and the hazy image at pixel $P_i$. The larger the value of $r$, the higher the contrast of the dehazed image and the better the effect.
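For reference, a sketch of how these two indicators can be computed is given below; the Sobel edge detector and the visibility threshold are illustrative assumptions, since the exact edge-detection procedure of the visible edge gradient method is not reproduced here.

```python
import numpy as np
from scipy import ndimage

def edge_metrics(hazy, dehazed, thresh=0.1):
    """Sketch of e and r on grayscale float images in [0, 1]. Visible
    edges are approximated by thresholding Sobel gradient magnitude."""
    g0 = np.hypot(ndimage.sobel(hazy, axis=0), ndimage.sobel(hazy, axis=1))
    g1 = np.hypot(ndimage.sobel(dehazed, axis=0), ndimage.sobel(dehazed, axis=1))
    v0, v1 = g0 > thresh, g1 > thresh                # visible-edge maps
    n0, nr = v0.sum(), v1.sum()
    e = (nr - n0) / max(n0, 1)                       # ratio of new visible edges
    ri = g1[v1] / np.maximum(g0[v1], 1e-8)           # gradient ratio at P_i in Y_r
    r = np.exp(np.log(np.maximum(ri, 1e-8)).mean())  # Eq. (8), geometric mean
    return e, r
```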
Figure 6. Comparison with existing traditional methods on six real hazy images from the icing wind tunnel. (1) and (2): LWC = 1.31 g/m³; (3) and (4): LWC = 1.0 g/m³; (5) and (6): LWC = 0.5 g/m³.

Figure 7. Comparison with existing deep learning methods on six real hazy images from the icing wind tunnel. (1) and (2): LWC = 1.31 g/m³; (3) and (4): LWC = 1.0 g/m³; (5) and (6): LWC = 0.5 g/m³.
ACKNOWLEDGMENTS
This work was supported by Key Laboratory of Icing and Anti/De-icing of CARDC (Grant No. IADL20210203) and
Natural Science Foundation of Sichuan, China under Grant 2023NSFSC0504. All data in this paper are supported by Key
Laboratory of Icing and Anti/De-icing of CARDC. Moreover, the authors would like to acknowledge the following people
for their assistance: Dr. Quan Zhang, Jiatian Wei, Tianfei Wang, and Shun Wang.
REFERENCES
[1] McCartney, E. J., “Optics of the atmosphere: scattering by molecules and particles,” New York (1976).
[2] He, K., Sun, J., and Tang, X., “Single image haze removal using dark channel prior,” IEEE Transactions on Pattern Analysis and Machine Intelligence 33(12), 2341–2353 (2010).
[3] Zhu, Q., Mai, J., and Shao, L., “A fast single image haze removal algorithm using color attenuation prior,” IEEE Transactions on Image Processing 24(11), 3522–3533 (2015).
[4] Cai, B., Xu, X., Jia, K., Qing, C., and Tao, D., “Dehazenet: An end-to-end system for single image haze removal,”
IEEE Transactions on Image Processing 25(11), 5187–5198 (2016).
[5] Ren, W., Liu, S., Zhang, H., Pan, J., Cao, X., and Yang, M.-H., “Single image dehazing via multi-scale
convolutional neural networks,” in [European conference on computer vision], 154–169, Springer (2016).
[6] Zhang, H. and Patel, V. M., “Densely connected pyramid dehazing network,” in [Proceedings of the IEEE
conference on computer vision and pattern recognition], 3194–3203 (2018).
[7] Guo, T., Li, X., Cherukuri, V., and Monga, V., “Dense scene information estimation network for dehazing,” in
[Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops], 0–0 (2019).
[8] Ancuti, C. O., Ancuti, C., Sbert, M., and Timofte, R., “Dense haze: A benchmark for image dehazing with dense-haze and haze-free images,” arXiv (2019).
[9] Jobson, D. J., Rahman, Z.-u., and Woodell, G. A., “A multiscale retinex for bridging the gap between color images and the human observation of scenes,” IEEE Transactions on Image Processing 6(7), 965–976 (1997).
[10] Soni, B. and Mathur, P., “An improved image dehazing technique using clahe and guided filter,” in [2020 7th
International Conference on Signal Processing and Integrated Networks (SPIN)], 902–907, IEEE (2020).
[11] Dong, Y., Liu, Y., Zhang, H., Chen, S., and Qiao, Y., “Fd-gan: Generative adversarial networks with fusion-
discriminator for single image dehazing,” in [Proceedings of the AAAI Conference on Artificial Intelligence],
34(07), 10729–10736 (2020).
[12] Liu, X., Ma, Y., Shi, Z., and Chen, J., “Griddehazenet: Attention-based multi-scale network for image dehazing,” in [Proceedings of the IEEE/CVF international conference on computer vision], 7314–7323 (2019).
[13] Qin, X., Wang, Z., Bai, Y., Xie, X., and Jia, H., “Ffa-net: Feature fusion attention network for single image
dehazing,” in [Proceedings of the AAAI Conference on Artificial Intelligence], 34(07), 11908–11915 (2020).
[14] Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K. Q., “Densely connected convolutional networks,” in [Proceedings of the IEEE conference on computer vision and pattern recognition], 4700–4708 (2017).
[15] Shi, W., Caballero, J., Huszár, F., Totz, J., Aitken, A. P., Bishop, R., Rueckert, D., and Wang, Z., “Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network,” in [Proceedings of the IEEE conference on computer vision and pattern recognition], 1874–1883 (2016).
[16] Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A., “Unpaired image-to-image translation using cycle-consistent
adversarial networks,” in [Computer Vision (ICCV), 2017 IEEE International Conference on], (2017).