Autoencoders and Restricted Boltzmann Machines

Amir H. Payberah
payberah@kth.se
2020-10-22

Let’s Start With An Example

• Which of them is easier to memorize?

Seq1: 40, 27, 25, 36, 81, 57, 10, 73, 19, 68
Seq2: 50, 25, 76, 38, 19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20

• Seq1 is shorter, so it should be easier.
• But Seq2 follows two simple rules:
  • Even numbers are followed by their half.
  • Odd numbers are followed by their triple plus one.
• You would not need patterns if you could quickly and easily memorize very long sequences.
• But long sequences are hard to memorize, which is what makes recognizing patterns useful.
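The two rules behind Seq2 fit in a few lines of code, which is the sense in which the pattern is a compact encoding of the sequence. A minimal sketch (the names `next_value` and `generate` are illustrative and do not come from the slides):

```python
def next_value(n):
    """Apply the two rules from the slide to a single number."""
    if n % 2 == 0:
        return n // 2      # even numbers are followed by their half
    return 3 * n + 1       # odd numbers are followed by their triple plus one


def generate(start, length):
    """Rebuild a sequence from its first element and its length."""
    seq = [start]
    while len(seq) < length:
        seq.append(next_value(seq[-1]))
    return seq


seq2 = generate(50, 18)
print(seq2)
```

Storing the start value 50, the length 18, and the rule is enough to reproduce all of Seq2, whereas Seq1 has to be stored element by element.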

• Early 1970s: W. Chase and H. Simon
• They observed that expert chess players were able to memorize the positions of all the pieces in a game by looking at the board for just 5 seconds.
• This was only the case when the pieces were placed in realistic positions, not when the pieces were placed randomly.
• Chess experts don’t have a much better memory than you and me.
• They just see chess patterns more easily due to their experience with the game.
• Patterns help them store information efficiently.

Autoencoders

Autoencoders (1/5)

• Just like the chess players in this memory experiment, an autoencoder looks at the inputs, converts them to an efficient internal representation, and then spits out something that looks very close to the inputs.

Autoencoders (2/5)

• The same architecture as a Multi-Layer Perceptron (MLP).
• Except that the number of neurons in the output layer must be equal to the number of inputs.

Autoencoders (3/5)

• An autoencoder is always composed of two parts.
• An encoder (recognition network), $h = f(x)$: converts the inputs to an internal representation.
• A decoder (generative network), $r = g(h)$: converts the internal representation to the outputs.
• If an autoencoder learns to set $g(f(x)) = x$ everywhere, it is not especially useful. Why?
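To make $h = f(x)$ and $r = g(h)$ concrete, here is a small sketch with a 4-dimensional input and a 2-dimensional coding. The weights are hand-picked for illustration, not learned, and the linear maps without activation functions are a simplifying assumption.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical, hand-picked weights: 4 inputs -> 2 codings -> 4 outputs.
W_ENC = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5]]
W_DEC = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 1.0],
         [0.0, 1.0]]


def f(x):
    """Encoder: inputs -> internal representation h."""
    return matvec(W_ENC, x)


def g(h):
    """Decoder: internal representation -> reconstruction r."""
    return matvec(W_DEC, h)


x = [2.0, 2.0, 6.0, 6.0]        # follows the pattern "values come in equal pairs"
h = f(x)
r = g(h)
print(h, r)

x_other = [1.0, 3.0, 6.0, 6.0]  # breaks the pattern
print(g(f(x_other)))
```

With only two coding values, $g(f(x)) = x$ holds for inputs that follow the pattern the weights capture, and other inputs are reconstructed only approximately.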

Autoencoders (4/5)

• Autoencoders are designed to be unable to learn to copy perfectly.
• Because the models are forced to prioritize which aspects of the input should be copied, they often learn useful properties of the data.

Autoencoders (5/5)

• Autoencoders are neural networks capable of learning efficient representations of the input data (called codings) without any supervision.
• Dimension reduction: these codings typically have a much lower dimensionality than the input data.

Different Types of Autoencoders

• Stacked autoencoders
• Denoising autoencoders
• Sparse autoencoders
• Variational autoencoders

Stacked Autoencoders (1/3)

• Stacked autoencoder: an autoencoder with multiple hidden layers.
• Adding more layers helps the autoencoder learn more complex codings.
• The architecture is typically symmetrical with regard to the central hidden layer.

Stacked Autoencoders (2/3)

• In a symmetric architecture, we can tie the weights of the decoder layers to the weights of the encoder layers.
• In a network with N layers, the decoder layer weights can be defined as $w_{N-l+1} = w_l^T$, with $l = 1, 2, \cdots, \frac{N}{2}$.
• This halves the number of weights in the model, speeding up training and limiting the risk of overfitting.
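The tying rule $w_{N-l+1} = w_l^T$ can be checked on a toy network. The sketch below assumes N = 4 layers with sizes 6 → 4 → 2 → 4 → 6 and arbitrary weight values. Weight matrices are stored as inputs × outputs, and bias terms are not tied, so they are left out of the count.

```python
def transpose(matrix):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def count_weights(matrices):
    """Total number of entries over a collection of matrices."""
    return sum(len(m) * len(m[0]) for m in matrices)


# Hypothetical encoder weights: w_1 is 6x4 and w_2 is 4x2 (inputs x outputs).
w1 = [[0.1 * (i + j) for j in range(4)] for i in range(6)]
w2 = [[0.2 * (i - j) for j in range(2)] for i in range(4)]

N = 4
layers = {1: w1, 2: w2}
for l in range(1, N // 2 + 1):
    layers[N - l + 1] = transpose(layers[l])   # w_{N-l+1} = w_l^T

free_weights = count_weights([w1, w2])          # weights that are actually trained
total_weights = count_weights(layers.values())  # weights used by the full network
print(free_weights, total_weights)
```

Only the encoder matrices are trainable parameters here. The decoder reuses them transposed, which is where the halving comes from.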

Stacked Autoencoders (3/3)

from tensorflow import keras  # assumed import (tf.keras, TF 2.x); the slides do not show it

# Encoder: 28x28 image -> 100 units -> 30-dimensional coding
stacked_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu"),
])
# Decoder: mirrors the encoder, 30 -> 100 -> 784 -> 28x28 image
stacked_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

model = keras.models.Sequential([stacked_encoder, stacked_decoder])


Denoising Autoencoders (1/4)

• One way to force the autoencoder to learn useful features is to add noise to its inputs, training it to recover the original noise-free inputs.
• This prevents the autoencoder from trivially copying its inputs to its outputs, so it ends up having to find patterns in the data.

Denoising Autoencoders (2/4)

• The noise can be pure Gaussian noise added to the inputs, or it can consist of randomly switched-off inputs, just like in dropout.
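The two kinds of input corruption can be sketched in plain Python. This is an illustration of the idea only: the function names are made up here, and unlike the Keras `Dropout` layer the masking function does not rescale the inputs that are kept.

```python
import random


def add_gaussian_noise(inputs, stddev, rng):
    """Add zero-mean Gaussian noise with the given standard deviation to each input."""
    return [x + rng.gauss(0.0, stddev) for x in inputs]


def switch_off(inputs, rate, rng):
    """Set each input to 0 with probability `rate`, as dropout does to its inputs."""
    return [0.0 if rng.random() < rate else x for x in inputs]


rng = random.Random(42)                 # fixed seed so the run is repeatable
clean = [0.9, 0.1, 0.8, 0.7, 0.2, 0.6]
noisy_gauss = add_gaussian_noise(clean, 0.2, rng)
noisy_masked = switch_off(clean, 0.5, rng)

# The training pair is (noisy input, clean target), not (clean, clean).
training_pairs = [(noisy_gauss, clean), (noisy_masked, clean)]
print(noisy_gauss)
print(noisy_masked)
```

In both cases the target stays the clean input, so the network cannot reduce its loss by simply copying what it receives.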

Denoising Autoencoders (3/4)

# Variant 1: dropout randomly switches off half of the inputs
denoising_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dropout(0.5),  # noise layer, active only during training
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu")
])
denoising_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

model = keras.models.Sequential([denoising_encoder, denoising_decoder])

Denoising Autoencoders (4/4)

# Variant 2: Gaussian noise with standard deviation 0.2 added to the inputs
denoising_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.GaussianNoise(0.2),  # noise layer, active only during training
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dense(30, activation="relu")
])
denoising_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="relu", input_shape=[30]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

model = keras.models.Sequential([denoising_encoder, denoising_decoder])


Sparse Autoencoders (1/2)

• Add an appropriate term to the cost function to push the autoencoder to reduce the number of active neurons in the coding layer.
• This forces the autoencoder to represent each input as a combination of a small number of activations.
• As a result, each neuron in the coding layer typically ends up representing a useful feature.
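One possible form of that extra term is an L1 penalty on the coding-layer activations. The sketch below adds it to a mean squared reconstruction error. The choice of mean squared error and all the numbers are assumptions made for illustration.

```python
def mse(targets, outputs):
    """Mean squared reconstruction error."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def l1_activity_penalty(activations, weight):
    """Penalty proportional to the summed absolute coding-layer activations."""
    return weight * sum(abs(a) for a in activations)


def sparse_cost(inputs, reconstructions, codings, weight=1e-3):
    """Reconstruction loss plus the sparsity term."""
    return mse(inputs, reconstructions) + l1_activity_penalty(codings, weight)


inputs = [1.0, 0.0, 1.0, 0.0]
reconstructions = [0.5, 0.5, 0.5, 0.5]           # same reconstruction in both cases
dense_codings = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]   # every coding neuron is active
sparse_codings = [1.0, 0.0, 0.0, 0.5, 0.0, 0.0]  # only two coding neurons are active

dense_cost = sparse_cost(inputs, reconstructions, dense_codings)
sparse_cost_value = sparse_cost(inputs, reconstructions, sparse_codings)
print(dense_cost, sparse_cost_value)
```

For equal reconstruction quality, the coding with fewer active neurons has the lower cost, which is the pressure that training then follows.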

Sparse Autoencoders (2/2)

sparse_l1_encoder = keras.models.Sequential([
    keras.layers.Flatten(input_shape=[28, 28]),
    keras.layers.Dense(100, activation="selu"),
    # 300-unit coding layer; the L1 activity penalty pushes most activations toward 0
    keras.layers.Dense(300, activation="sigmoid",
                       activity_regularizer=keras.regularizers.l1(1e-3))
])

sparse_l1_decoder = keras.models.Sequential([
    keras.layers.Dense(100, activation="selu", input_shape=[300]),
    keras.layers.Dense(28 * 28, activation="sigmoid"),
    keras.layers.Reshape([28, 28])
])

model = keras.models.Sequential([sparse_l1_encoder, sparse_l1_decoder])


Variational Autoencoders (1/6)

• Variational autoencoders are probabilistic autoencoders.
• Their outputs are partly determined by chance, even after training.
  • As opposed to denoising autoencoders, which use randomness only during training.
• They are generative autoencoders, meaning that they can generate new instances that look like they were sampled from the training set.

Variational Autoencoders (2/6)

• Instead of directly producing a coding for a given input, the encoder produces a mean coding µ and a standard deviation σ.
• The actual coding is then sampled randomly from a Gaussian distribution with mean µ and standard deviation σ.
• After that, the decoder just decodes the sampled coding normally.
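The sampling step can be written as mean plus σ times standard Gaussian noise. The sketch below takes the log-variance $\log \sigma^2$ rather than σ itself, matching the `codings_log_var` name used in the Keras code of these slides, so that $\sigma = e^{\log \sigma^2 / 2}$. The encoder outputs are made-up numbers.

```python
import math
import random


def sample_coding(mean, log_var, rng):
    """Draw one coding: mean + sigma * epsilon, with epsilon ~ N(0, 1)."""
    return [m + math.exp(lv / 2.0) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


rng = random.Random(0)
mean = [0.5, -1.0, 2.0]         # hypothetical encoder output, mu
log_var = [0.0, -2.0, -20.0]    # hypothetical encoder output, log(sigma^2)

first = sample_coding(mean, log_var, rng)
second = sample_coding(mean, log_var, rng)
print(first)
print(second)
```

The same input yields a different coding on every call, and a dimension with a very small variance (the third one here) stays close to its mean.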

Variational Autoencoders (3/6)

• The cost function is composed of two parts.
• 1. The usual reconstruction loss.
  • Pushes the autoencoder to reproduce its inputs.
  • Using cross-entropy.
• 2. The latent loss.
  • Pushes the autoencoder to have codings that look as though they were sampled from a simple Gaussian distribution.
  • Using the KL divergence between the target distribution (the Gaussian distribution) and the actual distribution of the codings.
  • $\text{latent loss} = -\frac{1}{2} \sum_{i=1}^{K} \left(1 + \log(\sigma_i^2) - \sigma_i^2 - \mu_i^2\right)$
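A direct transcription of the latent loss formula into plain Python, taking $\mu_i$ and $\log(\sigma_i^2)$ as inputs. The function name and the example values are illustrative.

```python
import math


def latent_loss(mean, log_var):
    """-1/2 * sum_i (1 + log(sigma_i^2) - sigma_i^2 - mu_i^2)."""
    return -0.5 * sum(1.0 + lv - math.exp(lv) - m * m
                      for m, lv in zip(mean, log_var))


standard = latent_loss([0.0, 0.0], [0.0, 0.0])            # mu = 0, sigma = 1
shifted = latent_loss([1.0, 0.0], [0.0, 0.0])             # one mean moved away from 0
widened = latent_loss([0.0, 0.0], [math.log(4.0), 0.0])   # one variance raised to 4
print(standard, shifted, widened)
```

The loss is zero when every coding dimension already has mean 0 and variance 1, and grows as the mean or the variance moves away from those values.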

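Variational Autoencoders (4/6)

• Encoder part

import tensorflow as tf

# Sampling is not defined on the slides. This definition is an assumption:
# it draws codings from N(mean, exp(log_var)) as mean + sigma * epsilon.
class Sampling(keras.layers.Layer):
    def call(self, inputs):
        mean, log_var = inputs
        return tf.random.normal(tf.shape(log_var)) * tf.exp(log_var / 2) + mean

inputs = keras.layers.Input(shape=[28, 28])
z = keras.layers.Flatten()(inputs)
z = keras.layers.Dense(150, activation="relu")(z)
z = keras.layers.Dense(100, activation="relu")(z)
codings_mean = keras.layers.Dense(10)(z)     # mean coding, mu
codings_log_var = keras.layers.Dense(10)(z)  # log variance, log(sigma^2)
codings = Sampling()([codings_mean, codings_log_var])  # sampled from the normal distribution
variational_encoder = keras.models.Model(inputs=[inputs], outputs=[codings])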

Variational Autoencoders (5/6)

• Decoder part

codings_size = 10  # assumed value: matches the encoder's Dense(10) coding outputs
decoder_inputs = keras.layers.Input(shape=[codings_size])
x = keras.layers.Dense(100, activation="relu")(decoder_inputs)
x = keras.layers.Dense(150, activation="relu")(x)
x = keras.layers.Dense(28 * 28, activation="sigmoid")(x)
outputs = keras.layers.Reshape([28, 28])(x)
variational_decoder = keras.models.Model(inputs=[decoder_inputs], outputs=[outputs])

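Variational Autoencoders (6/6)

K = keras.backend  # backend alias used below (assumed; not shown on the slides)

codings = variational_encoder(inputs)
reconstructions = variational_decoder(codings)
model = keras.models.Model(inputs=[inputs], outputs=[reconstructions])

# Latent loss per instance: -1/2 * sum(1 + log(sigma^2) - sigma^2 - mu^2)
latent_loss = -0.5 * K.sum(
    1 + codings_log_var - K.exp(codings_log_var) - K.square(codings_mean),
    axis=-1)
# Mean over the batch, divided by 784 (28 * 28 pixels) to match the per-pixel reconstruction loss
model.add_loss(K.mean(latent_loss) / 784.)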

Restricted Boltzmann Machines

• A Restricted Boltzmann Machine (RBM) is a stochastic neural network.
• Stochastic means that the neuron activations have a probabilistic element, instead of being deterministic functions such as logistic or ReLU.
• The neurons form a bipartite graph:
  • One visible layer and one hidden layer.
  • A symmetric connection between the two layers.
  • There are no connections between neurons within a layer.

Let’s Start With An Example

RBM Example (1/11)

• We have a set of six movies, and we ask users to tell us which ones they want to watch.
• We want to learn two latent neurons (hidden neurons) underlying movie preferences, e.g., SF/fantasy and Oscar winners.

RBM Example (2/11)

• Our RBM would look like the following: six visible neurons (one per movie), each connected to both of the two hidden neurons.

RBM Example (3/11)

• Alice: (HP=1, Avatar=1, LOTR=1, Glad=0, Titan=0, Sep=0), Big SF fan.
• Bob: (HP=1, Avatar=0, LOTR=1, Glad=0, Titan=0, Sep=0), SF fan, but not Avatar.
• Carol: (HP=1, Avatar=1, LOTR=1, Glad=0, Titan=0, Sep=0), Big SF fan.
• David: (HP=0, Avatar=0, LOTR=1, Glad=1, Titan=1, Sep=1), Big Oscar winners fan.
• Eric: (HP=0, Avatar=0, LOTR=1, Glad=1, Titan=0, Sep=1), Oscar winners fan, but not Titanic.
• Fred: (HP=0, Avatar=0, LOTR=1, Glad=1, Titan=1, Sep=1), Big Oscar winners fan.

RBM Example (4/11)

• Assume the given input $x_i$ is 0 or 1 for each visible neuron $v_i$.
  • 1: like a movie, and 0: dislike a movie.
• Compute the activation energy at hidden neuron $h_j$:

$$a(h_j) = \sum_i w_{ij} v_i$$

RBM Example (5/11)

• For each hidden neuron $h_j$, we compute the probability $p(h_j)$:

$$a(h_j) = \sum_i w_{ij} v_i$$
$$p(h_j) = \text{sigmoid}(a(h_j)) = \frac{1}{1 + e^{-a(h_j)}}$$

• We turn on the hidden neuron $h_j$ with probability $p(h_j)$, and turn it off with probability $1 - p(h_j)$.
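The activation and sampling of the hidden neurons can be sketched as follows. The weight values are invented for illustration (a trained RBM would learn them), with rows for the six movies and columns for the two hidden neurons.

```python
import math
import random


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def hidden_probabilities(v, w):
    """p(h_j) = sigmoid(sum_i w[i][j] * v[i]) for every hidden neuron j."""
    n_hidden = len(w[0])
    return [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(n_hidden)]


def sample_states(probabilities, rng):
    """Turn each neuron on with its probability, off otherwise."""
    return [1 if rng.random() < p else 0 for p in probabilities]


# Hypothetical weights w[i][j]: rows are (HP, Avatar, LOTR, Glad, Titan, Sep),
# columns are (SF/fantasy, Oscar winners).
w = [[4.0, -2.0],
     [3.0, -2.0],
     [2.0, 2.0],
     [-2.0, 4.0],
     [-2.0, 3.0],
     [-2.0, 4.0]]

alice = [1, 1, 1, 0, 0, 0]
p_h = hidden_probabilities(alice, w)
h = sample_states(p_h, random.Random(1))
print(p_h, h)
```

With these weights Alice's preferences give a probability close to 1 for the SF/fantasy neuron and a low one for the Oscar winners neuron, but the sampled state is still random.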

RBM Example (6/11)

• Declaring that you like Harry Potter, Avatar, and LOTR doesn’t guarantee that the SF/fantasy hidden neuron will turn on.
• But it will turn on with a high probability.
  • In reality, wanting to watch all three of those movies makes us strongly suspect that you like SF/fantasy in general.
  • But there’s a small chance you like them for other reasons.

RBM Example (7/11)

• Conversely, if we know that a person likes SF/fantasy (so that the SF/fantasy neuron is on), we can ask the RBM to generate a set of movie recommendations.
• The hidden neurons send messages to the visible (movie) neurons, telling them to update their states:

$$a(v_i) = \sum_j w_{ij} h_j$$
$$p(v_i) = \text{sigmoid}(a(v_i)) = \frac{1}{1 + e^{-a(v_i)}}$$

• The SF/fantasy neuron being on doesn’t guarantee that we’ll always recommend all three of Harry Potter, Avatar, and LOTR.
  • For example, not everyone who likes science fiction liked Avatar.
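The reverse direction, from hidden states to movie recommendations, uses the same weights. In this sketch the weights are again invented, with Avatar given a smaller SF/fantasy weight than Harry Potter so that it is recommended less reliably.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def visible_probabilities(h, w):
    """p(v_i) = sigmoid(sum_j w[i][j] * h[j]) for every visible neuron i."""
    return [sigmoid(sum(w_ij * h_j for w_ij, h_j in zip(row, h))) for row in w]


movies = ["HP", "Avatar", "LOTR", "Glad", "Titan", "Sep"]
# Hypothetical weights w[i][j]: one row per movie, columns (SF/fantasy, Oscar winners).
w = [[4.0, -2.0],
     [3.0, -2.0],
     [2.0, 2.0],
     [-2.0, 4.0],
     [-2.0, 3.0],
     [-2.0, 4.0]]

sf_fan = [1, 0]   # SF/fantasy neuron on, Oscar winners neuron off
p_v = visible_probabilities(sf_fan, w)
for name, p in zip(movies, p_v):
    print(name, round(p, 3))
```

Each movie is then recommended with its own probability, so a single run may leave out one of the three SF/fantasy titles.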

RBM Example (8/11)

• How do we learn the connection weights $w_{ij}$ in our network?
• Assume that the input is a set of binary vectors x, each with six elements corresponding to one user’s movie preferences.
• We do the following steps in each epoch:
• 1. Take a training instance x and set the states of the visible neurons to these preferences.

RBM Example (9/11)

• 2. Update the states of the hidden neurons.
  • Compute $a(h_j) = \sum_i w_{ij} v_i$ for each hidden neuron $h_j$.
  • Set $h_j$ to 1 with probability $p(h_j) = \text{sigmoid}(a(h_j)) = \frac{1}{1 + e^{-a(h_j)}}$.
• 3. For each edge $e_{ij}$, compute $\text{positive}(e_{ij}) = v_i \times h_j$.
  • I.e., for each pair of neurons, measure whether they are both on.

RBM Example (10/11)

• 4. Update the state of the visible neurons in a similar manner.
  • We denote the updated visible neurons with $v'_i$.
  • Compute $a(v'_i) = \sum_j w_{ij} h_j$ for each visible neuron $v'_i$.
  • Set $v'_i$ to 1 with probability $p(v'_i) = \text{sigmoid}(a(v'_i)) = \frac{1}{1 + e^{-a(v'_i)}}$.
• 5. Update the hidden neurons again, similar to step 2. We denote the updated hidden neurons with $h'_j$.
• 6. For each edge $e_{ij}$, compute $\text{negative}(e_{ij}) = v'_i \times h'_j$.

RBM Example (11/11)

• 7. Update the weight of each edge $e_{ij}$:

$$w_{ij} = w_{ij} + \eta\,(\text{positive}(e_{ij}) - \text{negative}(e_{ij}))$$

• 8. Repeat over all training examples.
• 9. Continue until the error between the training examples and their reconstructions falls below some threshold or we reach some maximum number of epochs.
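Steps 1 to 9 can be put together in a short training loop. This is a sketch under several assumptions that the slides leave open: no bias terms (as in the formulas of the example), small random initial weights, a mean squared reconstruction error, and arbitrary values for the learning rate, the threshold, and the maximum number of epochs.

```python
import math
import random


def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)


def sample(probabilities, rng):
    return [1 if rng.random() < p else 0 for p in probabilities]


def hidden_probs(v, w):
    return [sigmoid(sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(len(w[0]))]


def visible_probs(h, w):
    return [sigmoid(sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(w))]


def train(data, n_hidden, eta=0.1, max_epochs=500, threshold=0.05, seed=0):
    rng = random.Random(seed)
    n_visible = len(data[0])
    w = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_visible)]
    error = float("inf")
    for _ in range(max_epochs):
        error = 0.0
        for x in data:                                    # step 8: all training examples
            v = list(x)                                   # step 1: clamp the visible neurons
            h = sample(hidden_probs(v, w), rng)           # step 2: sample the hidden neurons
            v_new = sample(visible_probs(h, w), rng)      # step 4: reconstruct the visible neurons
            h_new = sample(hidden_probs(v_new, w), rng)   # step 5: update the hidden neurons again
            for i in range(n_visible):
                for j in range(n_hidden):
                    positive = v[i] * h[j]                # step 3
                    negative = v_new[i] * h_new[j]        # step 6
                    w[i][j] += eta * (positive - negative)  # step 7
            error += sum((a - b) ** 2 for a, b in zip(v, v_new)) / n_visible
        error /= len(data)
        if error < threshold:                             # step 9: stop early if good enough
            break
    return w, error


data = [
    [1, 1, 1, 0, 0, 0],  # Alice
    [1, 0, 1, 0, 0, 0],  # Bob
    [1, 1, 1, 0, 0, 0],  # Carol
    [0, 0, 1, 1, 1, 1],  # David
    [0, 0, 1, 1, 0, 1],  # Eric
    [0, 0, 1, 1, 1, 1],  # Fred
]
weights, final_error = train(data, n_hidden=2)
print(final_error)
```

Because every state is sampled, the reconstruction error fluctuates from epoch to epoch, so the maximum number of epochs is what guarantees termination.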

RBM Training (1/2)

• Step 1, Gibbs sampling: what we have done in steps 1-6.
• Given an input vector v, compute $p(h|v)$.
• Knowing the hidden values h, we use $p(v|h)$ to predict new input values v.
• This process is repeated k times.

RBM Training (2/2)

• Step 2, contrastive divergence: what we have done in step 7.
  • Just a fancy name for approximate gradient descent.

$$w = w + \eta\,(\text{positive}(e) - \text{negative}(e))$$

More Details about RBM

Energy-based Model (1/3)

• Energy is a quantitative property in physics.
  • E.g., gravitational energy describes the potential energy a body with mass has in relation to another massive object due to gravity.

Energy-based Model (2/3)

• One purpose of deep learning models is to encode dependencies between variables.
• Dependencies are captured by associating a scalar energy with each state of the variables.
  • The energy serves as a measure of compatibility.
• A high energy means a bad compatibility.
• An energy-based model always tries to minimize a predefined energy function.

Energy-based Model (3/3)

• The energy function for the RBMs is defined as:

$$E(v, h) = -\left(\sum_{ij} w_{ij} v_i h_j + \sum_i b_i v_i + \sum_j c_j h_j\right)$$

• v and h represent the visible and hidden units, respectively.
• w represents the weights connecting visible and hidden units.
• b and c are the biases of the visible and hidden layers, respectively.
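The energy function translates directly into code. The parameters below (3 visible units, 2 hidden units) are made-up values chosen so that the arithmetic is easy to follow by hand.

```python
def energy(v, h, w, b, c):
    """E(v, h) = -(sum_ij w_ij v_i h_j + sum_i b_i v_i + sum_j c_j h_j)."""
    interaction = sum(w[i][j] * v[i] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    visible_bias = sum(b_i * v_i for b_i, v_i in zip(b, v))
    hidden_bias = sum(c_j * h_j for c_j, h_j in zip(c, h))
    return -(interaction + visible_bias + hidden_bias)


# Hypothetical parameters for 3 visible and 2 hidden units.
w = [[2.0, -1.0],
     [1.0, -1.0],
     [-1.0, 2.0]]
b = [0.5, 0.0, -0.5]
c = [0.0, 1.0]

compatible = energy([1, 1, 0], [1, 0], w, b, c)     # -(3.0 + 0.5 + 0.0)
incompatible = energy([1, 1, 0], [0, 1], w, b, c)   # -(-2.0 + 0.5 + 1.0)
print(compatible, incompatible)
```

The visible state (1, 1, 0) has a lower energy together with the first hidden unit than with the second, so the first pairing is the more compatible one.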

RBM is a Probabilistic Model (1/2)

• The probability of a certain state of v and h:

$$p(v, h) = \frac{e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}}$$

• In physics, the joint distribution $p(v, h)$ is known as the Boltzmann Distribution or Gibbs Distribution.
• At each point in time the RBM is in a certain state.
  • The state refers to the values of neurons in the visible and hidden layers v and h.

RBM is a Probabilistic Model (2/2)

• It is difficult to calculate the joint probability due to the huge number of possible combinations of v and h:

$$p(v, h) = \frac{e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}}$$

• Much easier is the calculation of the conditional probabilities of state h given the state v and vice versa (Gibbs sampling):

$$p(h|v) = \prod_j p(h_j|v)$$
$$p(v|h) = \prod_i p(v_i|h)$$
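For a tiny RBM the joint distribution can still be computed by brute force, which makes both points visible: the normalizing sum ranges over every joint state, while a conditional such as $p(h_j = 1 \mid v)$ needs no such sum. The sketch assumes 3 visible and 2 hidden units with made-up parameters, and it uses the standard closed form $\text{sigmoid}(c_j + \sum_i w_{ij} v_i)$, which includes the hidden bias $c_j$ from the energy function.

```python
import itertools
import math


def energy(v, h, w, b, c):
    interaction = sum(w[i][j] * v[i] * h[j]
                      for i in range(len(v)) for j in range(len(h)))
    return -(interaction
             + sum(b_i * v_i for b_i, v_i in zip(b, v))
             + sum(c_j * h_j for c_j, h_j in zip(c, h)))


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


# Hypothetical parameters for 3 visible and 2 hidden units.
w = [[2.0, -1.0], [1.0, -1.0], [-1.0, 2.0]]
b = [0.5, 0.0, -0.5]
c = [0.0, 1.0]

visible_states = list(itertools.product([0, 1], repeat=3))
hidden_states = list(itertools.product([0, 1], repeat=2))

# Normalizing sum over all 2^3 * 2^2 = 32 joint states.
Z = sum(math.exp(-energy(v, h, w, b, c))
        for v in visible_states for h in hidden_states)


def joint(v, h):
    return math.exp(-energy(v, h, w, b, c)) / Z


v = (1, 1, 0)
p_v = sum(joint(v, h) for h in hidden_states)
# p(h_0 = 1 | v) obtained from the joint distribution ...
from_joint = sum(joint(v, h) for h in hidden_states if h[0] == 1) / p_v
# ... and directly from the factorized conditional, without computing Z.
direct = sigmoid(c[0] + sum(w[i][0] * v[i] for i in range(3)))
print(from_joint, direct)
```

The number of joint states doubles with every added neuron, so the brute-force sum is only feasible for toy sizes, while the factorized conditional stays cheap.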

Learning in Boltzmann Machines (1/2)

I RBMs try to learn a probability distribution from the data they are given.
I Given a training set of state vectors v, learning consists of finding the parameters w of p(v, h) that give the training vectors a high probability p(v).
$$p(v) = \frac{\sum_h e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}}$$

I Use maximum-likelihood estimation.

I For a model of the form p(v) with parameters w, the log-likelihood given a single training example v is:
$$\log p(v) = \log \frac{\sum_h e^{-E(v,h)}}{\sum_{v,h} e^{-E(v,h)}} = \log \sum_h e^{-E(v,h)} - \log \sum_{v,h} e^{-E(v,h)}$$
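
The two terms of the log-likelihood can be evaluated exactly for a tiny RBM by enumeration. This is only a sketch under the assumption of binary units; the helper names and the toy parameters are illustrative.

```python
import itertools
import math


def energy(v, h, w, b, c):
    return -(sum(w[i][j] * v[i] * h[j]
                 for i in range(len(v)) for j in range(len(h)))
             + sum(b[i] * v[i] for i in range(len(v)))
             + sum(c[j] * h[j] for j in range(len(h))))


def log_sum_exp(values):
    # log(sum(exp(x))) computed without overflow.
    m = max(values)
    return m + math.log(sum(math.exp(x - m) for x in values))


def log_likelihood(v, w, b, c):
    """log p(v) = log sum_h e^{-E(v,h)} - log sum_{v,h} e^{-E(v,h)}."""
    hs = list(itertools.product([0, 1], repeat=len(c)))
    vs = list(itertools.product([0, 1], repeat=len(b)))
    first_term = log_sum_exp([-energy(v, h, w, b, c) for h in hs])
    second_term = log_sum_exp([-energy(u, h, w, b, c)
                               for u in vs for h in hs])
    return first_term - second_term


# Hypothetical toy parameters: 2 visible units, 2 hidden units.
w = [[1.0, -0.5], [0.0, 2.0]]
b = [0.1, -0.2]
c = [0.0, 0.3]
ll = log_likelihood((1, 0), w, b, c)
```

The first term only sums over hidden states for the given v, while the second sums over every state of the model; the second term is the expensive one.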

Learning in Boltzmann Machines (2/2)

I The log-likelihood gradients for an RBM with binary units:

$$\frac{\partial \log p(v)}{\partial w_{ij}} = \text{positive}(e_{ij}) - \text{negative}(e_{ij})$$

I Then, we can update the weight w as follows:

$$w_{ij}^{(\text{next})} = w_{ij} + \eta \left( \text{positive}(e_{ij}) - \text{negative}(e_{ij}) \right)$$
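
The update rule can be sketched with one step of contrastive divergence (CD-1). The slides do not define positive(e_ij) and negative(e_ij); the sketch assumes the common choice of v_i p(h_j = 1 | v) measured on the training vector for the positive term and on a one-step reconstruction for the negative term. Function names and toy values are hypothetical, and only the weights are updated, as in the rule above.

```python
import math
import random


def sigmoid(x):
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def hidden_probs(v, w, c):
    return [sigmoid(c[j] + sum(w[i][j] * v[i] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, w, b):
    return [sigmoid(b[i] + sum(w[i][j] * h[j] for j in range(len(h))))
            for i in range(len(b))]


def cd1_update(v0, w, b, c, eta, rng):
    """Return new weights after one CD-1 step on training vector v0."""
    ph0 = hidden_probs(v0, w, c)                         # positive phase
    h0 = [1 if rng.random() < p else 0 for p in ph0]
    pv1 = visible_probs(h0, w, b)                        # reconstruction
    v1 = [1 if rng.random() < p else 0 for p in pv1]
    ph1 = hidden_probs(v1, w, c)                         # negative phase
    return [[w[i][j] + eta * (v0[i] * ph0[j] - v1[i] * ph1[j])
             for j in range(len(c))]
            for i in range(len(b))]


# Hypothetical toy parameters: 3 visible units, 2 hidden units.
w = [[0.1, -0.1], [0.0, 0.2], [0.3, 0.0]]
b = [0.0, 0.0, 0.0]
c = [0.0, 0.0]
rng = random.Random(0)
w_next = cd1_update([1, 0, 1], w, b, c, eta=0.1, rng=rng)
```

In practice the same step is repeated over many training vectors, and the biases b and c are usually updated in an analogous way.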

Summary

I Autoencoders
• Stacked autoencoders
• Denoising autoencoders
• Variational autoencoders

I Restricted Boltzmann Machine
• Gibbs sampling
• Contrastive divergence

Reference

I Ian Goodfellow et al., Deep Learning (Ch. 14, 20)

I Aurélien Géron, Hands-On Machine Learning (Ch. 17)

