Unsupervised Learning:
Deep Auto-encoder
Unsupervised Learning
“We expect unsupervised learning to become far more important
in the longer term. Human and animal learning is largely
unsupervised: we discover the structure of the world by observing
it, not by being told the name of every object.”
– LeCun, Bengio, Hinton, Nature 2015
Auto-encoder
Encoder (NN): takes the input object (e.g. a 28 × 28 = 784-pixel image) and outputs a code, a compact representation of the input object (usually < 784 dimensions).
Decoder (NN): takes the code and can reconstruct the original object.
The encoder and decoder are learned together.
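A minimal sketch of this encoder/decoder pair in PyTorch (the 32-dim code, single linear layer per side, and the optimizer are illustrative assumptions; the slides only require the code to be smaller than 784):

```python
import torch
import torch.nn as nn

# Encoder and decoder are learned together: minimize the distance
# between the input x and the reconstruction x_hat.
# (Code size 32 is an illustrative choice, not from the slides.)
encoder = nn.Sequential(nn.Linear(784, 32), nn.ReLU())
decoder = nn.Sequential(nn.Linear(32, 784), nn.Sigmoid())
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()))

x = torch.rand(64, 784)                  # a batch of flattened 28x28 images
x_hat = decoder(encoder(x))              # input -> code -> reconstruction
loss = nn.functional.mse_loss(x_hat, x)  # "as close as possible"
opt.zero_grad()
loss.backward()
opt.step()
```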
Recap: PCA
Minimize $\lVert x - \hat{x} \rVert^2$, i.e. make $\hat{x}$ as close as possible to $x$.
encode: $c = Wx$    decode: $\hat{x} = W^T c$
Input layer → hidden layer (linear) → output layer. The hidden layer is the bottleneck layer, and its output is the code.
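Spelled out, the recap amounts to the following objective (the principal-subspace remark at the end is a standard result being added here, not stated on the slide):

```latex
% PCA viewed as a one-hidden-layer linear auto-encoder with tied weights:
% encode c = Wx, decode \hat{x} = W^{T} c, with W \in \mathbb{R}^{K \times d}, K < d.
\min_{W} \sum_{x} \bigl\lVert x - W^{T} W x \bigr\rVert^{2}
% For zero-mean data, the optimal rows of W span the same subspace as the
% top-K principal components.
```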
Deep Auto-encoder
• Of course, the auto-encoder can be deep.
Input layer $x$ → layer ($W_1$) → layer ($W_2$) → … → bottleneck layer (code) → … → layer ($W_2^T$) → layer ($W_1^T$) → output layer $\hat{x}$
The output $\hat{x}$ is trained to be as close as possible to the input $x$. (In the figure the decoder weights are tied to the transposed encoder weights $W_i^T$, a common but optional choice.)
Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
Deep Auto-encoder
(Figure: reconstructing an original 784-pixel image. PCA: 784 → 30 → 784. Deep auto-encoder: 784 → 1000 → 500 → 250 → 30 → 250 → 500 → 1000 → 784. The deep auto-encoder's reconstructions are visibly closer to the originals.)

(Figure: reducing the images to 2 dimensions for visualization. PCA: 784 → 2. Deep auto-encoder: 784 → 1000 → 500 → 250 → 2. The 2-dim deep auto-encoder codes separate the classes far better than PCA.)
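A sketch of the 784-1000-500-250-30 auto-encoder from the figure, in PyTorch (ReLU activations and untied weights are assumptions; Hinton & Salakhutdinov 2006 use logistic units with RBM pre-training):

```python
import torch.nn as nn

# The 784-1000-500-250-30 deep auto-encoder from the figure.
encoder = nn.Sequential(
    nn.Linear(784, 1000), nn.ReLU(),
    nn.Linear(1000, 500), nn.ReLU(),
    nn.Linear(500, 250), nn.ReLU(),
    nn.Linear(250, 30),                  # 30-dim bottleneck code
)
decoder = nn.Sequential(
    nn.Linear(30, 250), nn.ReLU(),
    nn.Linear(250, 500), nn.ReLU(),
    nn.Linear(500, 1000), nn.ReLU(),
    nn.Linear(1000, 784), nn.Sigmoid(),  # pixel intensities in [0, 1]
)
```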
Auto-encoder
• De-noising auto-encoder: add noise to the input $x$ to get $x'$, encode $x'$ into $c$, decode to $\hat{x}$, and train $\hat{x}$ to be as close as possible to the original clean $x$.
Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." ICML, 2008.
• More: contractive auto-encoder.
Ref: Rifai, Salah, et al. "Contractive auto-encoders: Explicit invariance during feature extraction." Proceedings of the 28th International Conference on Machine Learning (ICML-11). 2011.
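A minimal de-noising training step (Gaussian noise and its scale are assumptions; Vincent et al. also use other corruptions such as masking noise):

```python
import torch
import torch.nn as nn

def denoising_step(encoder, decoder, opt, x, noise_std=0.3):
    """One de-noising auto-encoder update: corrupt x, reconstruct the clean x."""
    x_noisy = x + noise_std * torch.randn_like(x)  # add noise
    x_hat = decoder(encoder(x_noisy))
    loss = nn.functional.mse_loss(x_hat, x)        # target is the CLEAN input
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```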
Deep Auto-encoder - Example

(Figure: t-SNE visualizations of the digit images: raw pixels → t-SNE, 32-dim PCA projection → t-SNE, and the 32-dim code $c$ from the NN encoder → t-SNE.)
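A sketch of how such a visualization could be produced with scikit-learn (the 32-dim code matches the slide; `encoder`, `images`, and `labels` are assumed to exist, e.g. from the earlier sketches):

```python
import torch
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Assumes `encoder` maps (N, 784) images to 32-dim codes and
# `images`, `labels` hold the digit data.
with torch.no_grad():
    codes = encoder(images).numpy()

points = TSNE(n_components=2).fit_transform(codes)  # 32-dim -> 2-D
plt.scatter(points[:, 0], points[:, 1], c=labels, s=2, cmap="tab10")
plt.show()
```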
Auto-encoder – Text Retrieval

Vector space model: represent the query and every document as vectors and retrieve by their distance.
Bag-of-word: the word string "This is an apple" becomes the vector
  this  1
  is    1
  a     0
  an    1
  apple 1
  pen   0
  …
Weakness: semantics are not considered.
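A tiny bag-of-word sketch (the six-word vocabulary comes from the slide; real systems use vocabularies of thousands of words):

```python
vocab = ["this", "is", "a", "an", "apple", "pen"]

def bag_of_words(text):
    """Count occurrences of each vocabulary word in the text."""
    tokens = text.lower().split()
    return [tokens.count(w) for w in vocab]

print(bag_of_words("This is an apple"))  # [1, 1, 0, 1, 1, 0]
```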
Auto-encoder – Text Retrieval

Compress the bag-of-word vector (document or query) with an auto-encoder: 2000 → 500 → 250 → 125 → 2. The documents talking about the same thing will have close codes, so a query can be matched to documents in the 2-dim code space.
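A retrieval sketch under these assumptions (a trained `encoder` mapping 2000-dim bag-of-word vectors to 2-dim codes):

```python
import torch

def retrieve(encoder, query_bow, doc_bows, k=5):
    """Indices of the k documents whose codes are closest to the query's."""
    with torch.no_grad():
        q = encoder(query_bow)           # 2-dim code for the query
        d = encoder(doc_bows)            # (N, 2) codes for the documents
    dist = torch.norm(d - q, dim=1)      # Euclidean distance in code space
    return torch.topk(dist, k, largest=False).indices
```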
Auto-encoder – Similar Image Search

(Figure: images retrieved using Euclidean distance in pixel intensity space. Images from Hinton's slides on Coursera.)
Reference: Krizhevsky, Alex, and Geoffrey E. Hinton. "Using very deep autoencoders for content-based image retrieval." ESANN. 2011.
Auto-encoder – Similar Image Search

Encode each 32 × 32 image with a deep auto-encoder, through layers of size 8192 → 4096 → 2048 → 1024 → 512 down to a 256-dim code (trained on millions of images crawled from the Internet), then retrieve by distance between codes.

(Figure: results retrieved using Euclidean distance in pixel intensity space vs. results retrieved using the 256-dim codes.)
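To contrast the two retrieval strategies on the slide (an illustrative sketch; `encoder`, `query_image`, and `images` are assumed to exist, with `encoder` being the network described above):

```python
import torch

def nearest(query, database):
    """Index of the database row closest to the query (Euclidean distance)."""
    return torch.norm(database - query, dim=1).argmin()

# Pixel-space retrieval: compare raw intensities directly.
idx_pixel = nearest(query_image.flatten(), images.flatten(1))

# Code-space retrieval: compare 256-dim auto-encoder codes instead.
with torch.no_grad():
    idx_code = nearest(encoder(query_image.flatten()),
                       encoder(images.flatten(1)))
```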
Auto-encoder for CNN

For images the encoder can be a CNN: convolution and pooling layers map the input down to a code. The decoder mirrors it with deconvolution and unpooling layers, and the reconstruction is trained to be as close as possible to the input.
CNN - Unpooling

Unpooling enlarges a feature map (e.g. 14 × 14 → 28 × 28): the locations of the maxima remembered from max pooling get the pooled values back, and the remaining entries are filled with zeros. Alternative: simply repeat the values.
Source of image: https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/image_segmentation.html
CNN - Deconvolution

(Figure: 1-D illustration of deconvolution. Each input value is weighted by the filter and spread out, and overlapping contributions are summed, which is equivalent to a convolution over a zero-padded input.)
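A sketch of such a convolutional auto-encoder in PyTorch (channel counts and kernel sizes are illustrative; `MaxUnpool2d` implements the remember-the-locations unpooling, `ConvTranspose2d` the deconvolution):

```python
import torch
import torch.nn as nn

conv1 = nn.Conv2d(1, 16, 3, padding=1)
pool = nn.MaxPool2d(2, return_indices=True)   # remember max locations
unpool = nn.MaxUnpool2d(2)
deconv1 = nn.ConvTranspose2d(16, 1, 3, padding=1)

x = torch.rand(8, 1, 28, 28)
h = torch.relu(conv1(x))
code, idx = pool(h)                  # 28x28 -> 14x14, keep locations
h2 = unpool(code, idx)               # 14x14 -> 28x28, zeros elsewhere
x_hat = torch.sigmoid(deconv1(h2))   # reconstruction, same size as x
loss = nn.functional.mse_loss(x_hat, x)
```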
Auto-encoder – Pre-training DNN

• Greedy layer-wise pre-training again. Target network: input 784 → 1000 → 1000 → 500 → output 10.

Step 1: train an auto-encoder 784 → 1000 → 784 to reconstruct the input $x$; keep the encoder weights $W_1$ and discard the decoder weights $W_1'$.
Step 2: fix $W_1$ and compute the first hidden activations $a^1$ (1000-dim); train an auto-encoder 1000 → 1000 → 1000 to reconstruct $a^1$; keep $W_2$, discard $W_2'$.
Step 3: fix $W_1$ and $W_2$ and compute $a^2$; train an auto-encoder 1000 → 500 → 1000 to reconstruct $a^2$; keep $W_3$, discard $W_3'$.
Step 4: randomly initialize the output-layer weights $W_4$, then fine-tune the whole network ($W_1$, $W_2$, $W_3$, $W_4$) by backpropagation.
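A condensed sketch of the procedure (PyTorch; the training loop length, optimizer, and random stand-in data are assumptions):

```python
import torch
import torch.nn as nn

sizes = [784, 1000, 1000, 500]   # layer sizes of the target network
layers = []

def train_autoencoder(enc, dec, data, epochs=10):
    """Train enc/dec to reconstruct `data`; return the trained encoder."""
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()))
    for _ in range(epochs):
        loss = nn.functional.mse_loss(dec(enc(data)), data)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return enc

data = torch.rand(256, 784)      # stand-in for the real inputs x
for d_in, d_out in zip(sizes[:-1], sizes[1:]):
    enc = nn.Sequential(nn.Linear(d_in, d_out), nn.ReLU())
    dec = nn.Linear(d_out, d_in)
    layers.append(train_autoencoder(enc, dec, data))  # keep W_i, drop W_i'
    with torch.no_grad():
        data = layers[-1](data)  # fix W_i; the next AE trains on a^i

# Step 4: add a randomly initialized output layer, then fine-tune everything.
network = nn.Sequential(*layers, nn.Linear(500, 10))
```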
Next …
NN decoder: code → output
• Can we use the decoder to generate something?