
High-resolution image generation using Artificial Intelligence and diffusion modeling


Collaborative project

Overview

Until recently, image synthesis tasks were performed using deep generative models such as GANs, VAEs, and auto-regressive models. However, these models struggle to synthesize high-quality samples on difficult, high-resolution datasets: GANs suffer from unstable training, while auto-regressive models generally suffer from slow synthesis speed. Diffusion methods have therefore recently played an important role in high-resolution image and sound generation since, compared to other generative methods (GANs, VAEs), their training is stable, which makes them very promising. Diffusion models work by corrupting the training data, progressively adding Gaussian noise that slowly erases the details until the data becomes pure noise, and then training a neural network to reverse this corruption process. Running this reverse process synthesizes data from pure noise by gradually removing the noise until a clean sample is produced. A comparison of different high-resolution image generation models based on diffusion is shown in Figure 1.

Figure 1: Comparison of different high-resolution image generation models based on diffusion
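The forward corruption process described above can be sketched in a few lines. The following is an illustrative toy example, not taken from any of the cited papers: it uses the standard closed-form sampling expression with a linear noise schedule to show that early steps keep most of the signal while the final step is essentially pure noise.

```python
import numpy as np

# Toy sketch of the forward diffusion process (illustrative assumptions, not
# code from the cited papers). With a linear beta schedule, x_t can be sampled
# in closed form: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps,
# where abar_t is the cumulative product of (1 - beta_i).

def make_schedule(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule and the cumulative products used to sample x_t."""
    betas = np.linspace(beta_start, beta_end, T)
    alpha_bars = np.cumprod(1.0 - betas)
    return betas, alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) directly, without simulating every noising step."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))              # stand-in for a normalized image
_, alpha_bars = make_schedule()
x_early = q_sample(x0, 10, alpha_bars, rng)   # mostly signal
x_late = q_sample(x0, 999, alpha_bars, rng)   # essentially pure noise
```

At t = 10 the cumulative product is still close to 1 (the image dominates); at t = 999 it is nearly 0, so the sample is almost entirely Gaussian noise, which is exactly the state the reverse process starts from.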

Diffusion Models
Artificial Intelligence (AI) is gaining more and more ground in the world of imaging, from creating realistic photographs and generating deepfakes to colorization and upscaling to higher resolutions. Recently, Google has employed AI to convert pixelated photographs into high-resolution images: its machine learning models take a low-resolution photo and upscale it with the goal of recovering as much detail as possible. There are several methods for upscaling a photo with Artificial Intelligence; for example, the mechanisms employed by Google, called SR3 and CDM, are diffusion models. Below you can find a list of different diffusion models that generate high-resolution images.

SR3 Super-Resolution via Repeated Refinement, also known as SR3, is a method that takes a low-resolution image as input and builds a high-quality photograph starting from pure noise. The model is trained by progressively adding noise to an image until only noise remains, and then learning to reverse this corruption process.
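The iterative refinement idea can be sketched as follows. This is a hypothetical toy example, not the real SR3 network: the "denoiser" below is a stand-in that pulls the sample toward a naive upsampling of the conditioning image, whereas a real model is a trained network that predicts the added noise.

```python
import numpy as np

# Hypothetical sketch of SR3-style sampling (all names and the toy denoiser
# are illustrative stand-ins, not the actual SR3 implementation). Generation
# starts from pure noise and repeatedly applies a denoising step conditioned
# on the low-resolution input, refining the estimate at every step.

def toy_denoiser(y_t, x_lowres, t):
    """Stand-in for the trained network: nudges the sample toward a naive 2x
    upsampling of the conditioning image (a real model predicts the noise)."""
    target = np.repeat(np.repeat(x_lowres, 2, axis=0), 2, axis=1)
    return y_t + (target - y_t) / (t + 1)

def sr3_sample(x_lowres, T=50, rng=None):
    """Iterative refinement: y_T is pure Gaussian noise; steps run T-1, ..., 0."""
    rng = rng or np.random.default_rng(0)
    h, w = x_lowres.shape
    y = rng.standard_normal((2 * h, 2 * w))   # start from pure noise
    for t in reversed(range(T)):
        y = toy_denoiser(y, x_lowres, t)
    return y

x = np.ones((4, 4))     # toy low-resolution "image"
y = sr3_sample(x)       # refined 8x8 output
```

The key structural point carried over from SR3 is the loop: the output is not produced in one shot but by many small conditional denoising steps.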

CDM CDM (Cascaded Diffusion Models) is a class-conditional diffusion pipeline trained on ImageNet data to generate high-resolution natural images. The cascade starts with a standard diffusion model at the lowest resolution, followed by a sequence of super-resolution models that progressively add detail. The result can be applied directly to improve photographs, and it also serves to improve the resolution of images taken with a mobile camera.
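The cascade structure can be sketched with toy stages. This is an illustrative assumption-laden sketch, not Google's CDM code: each stand-in stage is a plain function, whereas in the real pipeline every stage is itself a conditional diffusion sampler.

```python
import numpy as np

# Illustrative cascade sketch (not Google's CDM implementation): a base
# generator produces a low-resolution sample, then each super-resolution
# stage doubles the resolution, conditioned on the previous stage's output.

def base_stage(shape, rng):
    """Stand-in base model: emits a low-resolution sample."""
    return rng.standard_normal(shape)

def sr_stage(x):
    """Stand-in super-resolution stage: naive 2x nearest-neighbor upsample.
    A real stage would run a conditional diffusion sampler here."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def cascade(base_shape=(32, 32), n_sr_stages=3, rng=None):
    rng = rng or np.random.default_rng(0)
    x = base_stage(base_shape, rng)   # e.g. 32x32
    for _ in range(n_sr_stages):      # 32 -> 64 -> 128 -> 256
        x = sr_stage(x)
    return x

out = cascade()   # final high-resolution sample
```

Splitting generation across resolutions is the design choice that lets each stage stay small while the cascade as a whole reaches high resolution.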

Big-GAN You can find more information about this model in the reference [1].

Palette Palette is a simple and general diffusion-based framework for image-to-image translation. It was evaluated on four challenging computer vision tasks, namely colorization, inpainting, uncropping, and JPEG restoration. The authors claim that Palette is able to outperform strong task-specific GANs without any task-specific customization or hyper-parameter tuning. You can find information about Palette in [2] and the code to reproduce the experiments in [3].
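The conditioning idea behind image-to-image diffusion can be sketched briefly. This is a hedged illustration (shapes and names are assumptions, not the authors' code): the source image is concatenated channel-wise with the current noisy sample, so one architecture covers all four tasks simply by changing what the source image contains.

```python
import numpy as np

# Illustrative sketch of image-to-image conditioning (assumed shapes and
# names, not Palette's actual code). The conditioning image and the noisy
# sample are stacked along the channel axis and fed to the denoising network.

def make_model_input(noisy_target, source):
    """Stack the conditioning image and the noisy sample along channels."""
    return np.concatenate([source, noisy_target], axis=-1)

rng = np.random.default_rng(0)
source = rng.random((64, 64, 3))           # conditioning image (e.g. masked input)
noisy = rng.standard_normal((64, 64, 3))   # current diffusion state x_t
inp = make_model_input(noisy, source)      # 6-channel input for the network
```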

Available Resources and Supplementary Material


• Implementation using the SR3 method, available in:
https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement,
https://github.com/Scoles12/CPIASionisScoles.git, https://arxiv.org/pdf/2104.07636.pdf

• Datasets: Available in: https://paperswithcode.com/datasets

• Generative models description (e.g. GAN, VAE), available in:
https://paperswithcode.com/methods/category/generative-models. These are older implementations and can be found easily in: https://paperswithcode.com/task/super-resolution.

• Other implemented methods, available in: https://paperswithcode.com/methods

Palette: In-painting model.
Paper: https://arxiv.org/pdf/2111.05826.pdf
Code: https://github.com/Janspiry/Palette-Image-to-Image-Diffusion-Models

Latent Diffusion Model:
Paper: https://arxiv.org/pdf/2112.10752.pdf
Code: https://github.com/CompVis/latent-diffusion

Cold Diffusion Model:
Paper: https://arxiv.org/pdf/2208.09392.pdf
Code: https://github.com/arpitbansal297/Cold-Diffusion-Models

Denoising Diffusion Probabilistic Models:
Paper: https://arxiv.org/pdf/2006.11239.pdf
Code: https://github.com/lucidrains/denoising-diffusion-pytorch

Adaptive Feature Modification Layers:
Paper: https://arxiv.org/pdf/1904.08118.pdf
Code: https://github.com/hejingwenhejingwen/AdaFM

Tasks:
• Study diffusion methods (not including SR3) for generating high-resolution images.

• Understand and execute the suggested implementations.

• Implement an alternative method from the suggested ones (GANs, VAE, Big-GAN) and evaluate the results on different images of the ImageNet dataset, considering the metrics indicated in the papers.

• Study the possibility of rescaling.

Supervisors
The project supervisors are:

• Beatriz Otero, Universitat Politècnica de Catalunya, beatriz.otero@upc.edu

• Gladys Utrera, Universitat Politècnica de Catalunya, gladys.utrera@upc.edu

References

[1] “Big-GAN,” https://paperswithcode.com/paper/large-scale-gan-training-for-high-fidelity. Online; accessed 1st November 2022.

[2] “Palette,” https://openreview.net/pdf?id=FPGs276lUeq. Online; accessed 1st November 2022.

[3] “Palette source codes,” https://iterative-refinement.github.io/palette/. Online; accessed 1st November 2022.
