SINGLE IMAGE SUPER-RESOLUTION USING DEEP LEARNING
Dmitry Korobchenko, Marco Foco
NVIDIA
June 2017
GOAL
Obtain a high-resolution image from a given low-resolution image.

[Figure labels: LOW-RES INPUT; Upscale; DEEP LEARNING]

APPROACH
Our super-resolution model is based on a deep neural network. It is trained end-to-end to produce a high-resolution output from a given low-resolution input by minimizing a distance between the output and the ground truth.

[Figure: training diagram. The input $x$ is fed to the Deep Neural Net $G(x)$, and the Loss compares the output $\hat{y}$ with the ground truth $y$.]

The deep learning approach exploits prior knowledge and statistics extracted from training images.

ARCHITECTURE
• $G(x) = \mathrm{Bicubic}(x) + \mathrm{DNN}(x)$
• Downscaling layers (D): to increase the receptive field and capture more semantic features
• Skipped connections (S): to propagate low-level features and avoid loss of details after downscaling
• Residual blocks (R): to improve convergence and increase the number of layers

[Figure: J-Res-Net* 4x architecture diagram, built from blocks labeled R, D, S, and U, with a Bicubic branch]

LOSS FUNCTION
• MSE loss: corresponds to the PSNR metric, which poorly represents perceptual image quality
• HFENN loss:
  • High-Frequency Error Norm (Normalized)
  • $\mathrm{HFENN} = \lVert \mathrm{LoG}(\hat{y} - y) \rVert_2 / \mathrm{const}$, where LoG is the Laplacian of Gaussian
  • Boosts reconstruction of high-frequency details
• Composite loss: $\mathrm{MSE} + \alpha \cdot \mathrm{HFENN}$

GAN
Photorealistic image features could be boosted by means of Generative Adversarial Networks.
• Generator: our pretrained super-res model
• Discriminator: binary classifier to distinguish upscaled images from real high-resolution images

[Figure: GAN diagram. The Generator maps $x$ to $G(x)$; the Discriminator receives either $G(x)$ or real high-res images $y$ and outputs $D(y)$: real high-res or generated?]

GAN loss $= -\ln D(G(x))$
Total loss = Content loss + GAN loss

RESULTS
Mean values for Set5+Set14+BSDS100 datasets***

4x upscaling
Method   PSNR    SSIM     HFENI**
NVIDIA   27.63   0.7432   2.2634
LapSRN   27.60   0.7393   2.2271
VDSR     27.56   0.7385   2.1719
DRCN     27.53   0.7372   2.1477

8x upscaling
Method   PSNR    SSIM     HFENI**
NVIDIA   24.75   0.6087   0.8954
LapSRN   24.62   0.5983   0.8173
SCN      24.36   0.5844   0.7628
VDSR     24.29   0.5823   0.7519

[Figure: 4X UPSCALING]
[Figure: 4X UPSCALING (GAN)]

The super-resolution technology is released within the NVIDIA GameWorks Materials & Textures service: gwmt.nvidia.com

* J-Net: following the U-Net notation idea (Ronneberger et al.)
** Inverse HFENN, suitable for evaluation of high-frequency details
*** Result images for other algorithms were taken from the LapSRN work (Lai et al.)