K-Max Pooling Operation
2
Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN
3
Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN
4
Image Recognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
5
Image Recognition
6
Local Connectivity
Neurons connect to a small region of the input
7
Parameter Sharing
• The same feature in different positions
Neurons share the same weights
8
Parameter Sharing
• Different features in the same position
Neurons have different weights
9
Convolutional Layers
(Figure: the structure of a convolutional layer, showing the filter weights and the height and depth of the volume.)
10
Convolutional Layers
depth = 1 → depth = 2
b1 = wb1 a1 + wb2 a2
c1 = wc1 a1 + wc2 a2
b2 = wb1 a2 + wb2 a3
c2 = wc1 a2 + wc2 a3
(Inputs a1, a2, a3 produce two output channels b and c; each channel's weights are shared across positions.)
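A minimal numpy sketch (input and weight values are made up for illustration) of the shared-weight computation above: one input channel, two filters b and c, each slid across the input.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])    # a1, a2, a3 (assumed values)
w_b = np.array([0.5, -1.0])      # wb1, wb2 (shared across positions)
w_c = np.array([2.0, 0.25])      # wc1, wc2 (shared across positions)

def conv1d_valid(x, w):
    # output[i] = w[0]*x[i] + w[1]*x[i+1], i.e. b1 = wb1*a1 + wb2*a2, ...
    return np.array([w @ x[i:i + len(w)] for i in range(len(x) - len(w) + 1)])

b = conv1d_valid(a, w_b)   # [b1, b2]
c = conv1d_valid(a, w_c)   # [c1, c2]
print(b, c)
```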
11
Convolutional Layers
depth = 2 → depth = 2
c1 = a1 wc1 + b1 wc2 + a2 wc3 + b2 wc4
c2 = a2 wc1 + b2 wc2 + a3 wc3 + b3 wc4
(With two input channels a and b, each output in channel c combines both channels at two positions through the shared weights wc1–wc4; channel d is computed analogously.)
12
Convolutional Layers
depth = 2 → depth = 2
c1 = a1 wc1 + b1 wc2 + a2 wc3 + b2 wc4
d1 = a1 wd1 + b1 wd2 + a2 wd3 + b2 wd4
c2 = a2 wc1 + b2 wc2 + a3 wc3 + b3 wc4
d2 = a2 wd1 + b2 wd2 + a3 wd3 + b3 wd4
13
Convolutional Layers
(Figure: stacked feature maps labelled A, B, C and A, B, C, D.)
14
Hyper-parameters of CNN
• Stride (e.g. Stride = 1, Stride = 2)
• Padding (e.g. Padding = 0, Padding = 1: pad the border with zeros)
15
Example
Input volume: 7x7x3 (with Padding = 1), Filter: 3x3x3, Stride = 2 → Output volume: 3x3x2
http://cs231n.github.io/convolutional-networks/
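A quick sanity check of the example above using the standard output-size formula (W − F + 2P)/S + 1, assuming the 7x7x3 volume shown is a 5x5x3 input displayed with its padding.

```python
def conv_output_size(w, f, stride, pad):
    # Standard formula: (W - F + 2P) / S + 1
    return (w - f + 2 * pad) // stride + 1

# Assuming a 5x5x3 input padded with 1 (shown as 7x7x3 above):
print(conv_output_size(5, 3, stride=2, pad=1))   # -> 3, matching the 3x3x2 output
```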
16
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
17
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
18
Convolutional Layers
http://cs231n.github.io/convolutional-networks/
19
Relationship with Convolution
y[n] = Σ_k x[k] · w[n − k]
(Figure: the signal x[n], the flipped kernel w[0 − k], the kernel w[n], and the output y[n].)
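The same relationship expressed with numpy (the example signal and kernel are assumptions): np.convolve computes y[n] = Σ_k x[k] · w[n − k].

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])    # x[n] (assumed example signal)
w = np.array([0.5, 1.0, 0.5])          # w[n] (assumed example kernel)

y = np.convolve(x, w, mode="full")     # y[n] = sum_k x[k] * w[n - k]
print(y)

# Note: a CNN "convolutional" layer actually computes a correlation
# (the kernel is not flipped); flipping w recovers true convolution.
```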
20
Relationship with Convolution
y[n] = Σ_k x[k] · w[n − k]
(Figure: the signal x[n], the flipped and shifted kernel w[1 − k], the kernel w[n], and the output y[n].)
21
Relationship with Convolution
y[n] = Σ_k x[k] · w[n − k]
(Figure: the signal x[n], the flipped and shifted kernel w[2 − k], the kernel w[n], and the output y[n].)
22
Relationship with Convolution
y[n] = Σ_k x[k] · w[n − k]
(Figure: the signal x[n], the flipped and shifted kernel w[4 − k], the kernel w[n], and the output y[n].)
23
Nonlinearity
• Rectified Linear (ReLU)
n_out = n_in  if n_in > 0
n_out = 0     otherwise
Example: ReLU maps [1, 4, −3, 1]ᵀ to [1, 4, 0, 1]ᵀ.
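A one-line numpy sketch of ReLU applied element-wise to a vector like the one above.

```python
import numpy as np

relu = lambda v: np.maximum(v, 0)               # n_out = n_in if n_in > 0 else 0
print(relu(np.array([1.0, 4.0, -3.0, 1.0])))    # [1. 4. 0. 1.]
```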
24
Why ReLU?
• Easy to train
• Avoid gradient vanishing problem
(Figure: the sigmoid saturates, so its gradient ≈ 0 in the saturated regions; ReLU does not saturate for positive inputs.)
25
Why ReLU?
• Biological reason
(Figure: a neuron's response over time to strong vs. weak stimulation, compared with the ReLU activation: strong stimulation produces output, weak stimulation produces none.)
26
Pooling Layer
Input (4x4):
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0
Maximum Pooling (2x2, no overlap):
7 8
5 3
Average Pooling (2x2, no overlap):
4   5
2.5 1.5
e.g. Max(1,3,5,7) = 7, Avg(1,3,5,7) = 4, Max(0,0,5,5) = 5
Pooling has no weights; here depth = 1.
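A small numpy sketch of the 2x2, non-overlapping max and average pooling shown above, applied to the same 4x4 input.

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [0, 0, 3, 3],
              [5, 5, 0, 0]], dtype=float)

# Split the 4x4 map into non-overlapping 2x2 blocks, then reduce each block.
blocks = x.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3).reshape(2, 2, 4)

print(blocks.max(axis=-1))    # [[7. 8.], [5. 3.]]
print(blocks.mean(axis=-1))   # [[4. 5.], [2.5 1.5]]
```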
27
Why “Deep” Learning?
28
Visual Perception of Human
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg
29
Visual Perception of Computer
(Figure: Input Layer → Convolutional Layer → Pooling Layer → Convolutional Layer → Pooling Layer; each convolutional layer looks at receptive fields of the layer below.)
30
Visual Perception of Computer
(Figure: Input Layer → Convolutional Layer with receptive fields of width = 3, height = 3 → Max-pooling Layer → filter responses → class label, e.g. "7".)
32
Visual Perception of Computer
• Alexnet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
33
Training
• Forward Propagation
(Figure: neuron n1 feeds neuron n2 through weight w21; forward propagation computes n1(out) → n2(in) → n2(out).)
34
Training
• Update weights
(Figure: neuron n1 feeds neuron n2 through weight w21; cost function J.)
∂J/∂w21 = (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)
Gradient descent update:
w21 ← w21 − η ∂J/∂w21
⇒ w21 ← w21 − η (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)
35
Training
• Update weights
(Figure: the same two-neuron network, with cost function J, used to illustrate the weight update.)
37
Training Convolutional Layers
• example:
(Figure: a convolutional layer with inputs a1, a2, a3, outputs b1, b2, and shared weights wb1, wb2.)
38
Training Convolutional Layers
• Forward propagation
(Convolutional layer with inputs a1, a2, a3 and shared weights wb1, wb2:)
b1 = wb1 a1 + wb2 a2
b2 = wb1 a2 + wb2 a3
39
Training Convolutional Layers
• Update weights
(Cost function J depends on b1 and b2, both of which depend on wb1.)
∂J/∂wb1 = (∂J/∂b1)(∂b1/∂wb1) + (∂J/∂b2)(∂b2/∂wb1)
wb1 ← wb1 − η ( (∂J/∂b1)(∂b1/∂wb1) + (∂J/∂b2)(∂b2/∂wb1) )
40
Training Convolutional Layers
• Update weights
b1 = wb1 a1 + wb2 a2  ⇒  ∂b1/∂wb1 = a1
b2 = wb1 a2 + wb2 a3  ⇒  ∂b2/∂wb1 = a2
wb1 ← wb1 − η ( (∂J/∂b1) a1 + (∂J/∂b2) a2 )
41
Training Convolutional Layers
• Update weights
∂J/∂wb2 = (∂J/∂b1)(∂b1/∂wb2) + (∂J/∂b2)(∂b2/∂wb2)
wb2 ← wb2 − η ( (∂J/∂b1)(∂b1/∂wb2) + (∂J/∂b2)(∂b2/∂wb2) )
42
Training Convolutional Layers
• Update weights
b1 = wb1 a1 + wb2 a2  ⇒  ∂b1/∂wb2 = a2
b2 = wb1 a2 + wb2 a3  ⇒  ∂b2/∂wb2 = a3
wb2 ← wb2 − η ( (∂J/∂b1) a2 + (∂J/∂b2) a3 )
43
Training Convolutional Layers
• Propagate to the previous layer
∂J/∂a1 = (∂J/∂b1)(∂b1/∂a1)
∂J/∂a2 = (∂J/∂b1)(∂b1/∂a2) + (∂J/∂b2)(∂b2/∂a2)
∂J/∂a3 = (∂J/∂b2)(∂b2/∂a3)
44
Training Convolutional Layers
• Propagate to the previous layer
b1 = wb1 a1 + wb2 a2  ⇒  ∂b1/∂a1 = wb1,  ∂b1/∂a2 = wb2
b2 = wb1 a2 + wb2 a3  ⇒  ∂b2/∂a2 = wb1,  ∂b2/∂a3 = wb2
∂J/∂a1 = (∂J/∂b1) wb1
∂J/∂a2 = (∂J/∂b1) wb2 + (∂J/∂b2) wb1
∂J/∂a3 = (∂J/∂b2) wb2
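A minimal numpy sketch (input, weight, and gradient values are made up) of the updates derived above: given upstream gradients ∂J/∂b1 and ∂J/∂b2, compute the weight gradients and the gradients propagated to a1, a2, a3.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])        # a1, a2, a3 (assumed values)
w = np.array([0.5, -1.0])            # wb1, wb2 (shared weights)
dJ_db = np.array([0.1, -0.2])        # upstream gradients dJ/db1, dJ/db2 (assumed)
eta = 0.01                           # learning rate

# Weight gradients: dJ/dwb1 = dJ/db1*a1 + dJ/db2*a2, dJ/dwb2 = dJ/db1*a2 + dJ/db2*a3
dJ_dw = np.array([dJ_db @ a[0:2], dJ_db @ a[1:3]])
w_new = w - eta * dJ_dw

# Gradients propagated to the inputs:
# dJ/da1 = dJ/db1*wb1, dJ/da2 = dJ/db1*wb2 + dJ/db2*wb1, dJ/da3 = dJ/db2*wb2
dJ_da = np.array([dJ_db[0] * w[0],
                  dJ_db[0] * w[1] + dJ_db[1] * w[0],
                  dJ_db[1] * w[1]])
print(w_new, dJ_da)
```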
45
Max-Pooling Layers during Training
• Pooling layers have no weights
• No need to update weights
(Max-pooling over a1, a2, a3, assuming a1 > a2 and a2 > a3:)
b1 = max(a1, a2)
b2 = max(a2, a3) = a2 if a2 ≥ a3, a3 otherwise
∂b2/∂a2 = 1 if a2 ≥ a3, 0 otherwise
46
Max-Pooling Layers during Training
• Propagate to the previous layer
(Assuming a1 > a2 and a2 > a3:)
∂b1/∂a1 = 1,  ∂b1/∂a2 = 0
∂b2/∂a2 = 1,  ∂b2/∂a3 = 0
so ∂J/∂b1 flows entirely to a1, ∂J/∂b2 flows entirely to a2, and a3 receives no gradient.
47
Max-Pooling Layers during Training
• What if a1 = a2?
◦ Choose the node with the smaller index
(Figure: with a1 = a2 = a3, ∂J/∂b1 is routed to a1 and ∂J/∂b2 is routed to a2.)
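A small numpy sketch of the max-pooling forward and backward pass described above, with overlapping windows of size 2 (b1 = max(a1, a2), b2 = max(a2, a3)) and the smaller-index tie-breaking rule.

```python
import numpy as np

def maxpool1d_forward(a):
    # np.argmax returns the FIRST maximal element, which implements the
    # "choose the node with the smaller index" tie-breaking rule.
    idx = [i + np.argmax(a[i:i + 2]) for i in range(len(a) - 1)]
    return a[idx], idx

def maxpool1d_backward(dJ_db, idx, n):
    # Each upstream gradient flows only to the input that produced the max.
    dJ_da = np.zeros(n)
    for g, i in zip(dJ_db, idx):
        dJ_da[i] += g
    return dJ_da

a = np.array([3.0, 3.0, 1.0])              # a1 = a2: tie resolved toward a1
b, idx = maxpool1d_forward(a)               # b = [3, 3], idx = [0, 1]
print(maxpool1d_backward(np.array([0.5, 0.2]), idx, len(a)))   # [0.5 0.2 0. ]
```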
48
Avg-Pooling Layers during Training
• Pooling layers have no weights
• No need to update weights
(Avg-pooling over a1, a2, a3:)
b1 = (a1 + a2) / 2
b2 = (a2 + a3) / 2
∂b2/∂a2 = 1/2,  ∂b2/∂a3 = 1/2
49
Avg-Pooling Layers during Training
• Propagate to the previous layer
∂b1/∂a1 = ∂b1/∂a2 = 1/2,  ∂b2/∂a2 = ∂b2/∂a3 = 1/2
∂J/∂a1 = (1/2) ∂J/∂b1
∂J/∂a2 = (1/2) (∂J/∂b1 + ∂J/∂b2)
∂J/∂a3 = (1/2) ∂J/∂b2
50
ReLU during Training
n_out = n_in  if n_in > 0,  0 otherwise
∂n_out/∂n_in = 1  if n_in > 0,  0 otherwise
51
Training CNN
52
Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN
53
LeNet
◦ Paper:
http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf
54
ImageNet Challenge
• ImageNet Large Scale Visual Recognition Challenge
◦ http://image-net.org/challenges/LSVRC/
• Dataset:
◦ 1000 categories
◦ Training: 1,200,000
◦ Validation: 50,000
◦ Testing: 100,000
http://vision.stanford.edu/Datasets/collage_s.png
55
ImageNet Challenge
http://www.qingpingshan.com/uploads/allimg/160818/1J22QI5-0.png
56
AlexNet (2012)
• Paper:
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• The resurgence of Deep Learning
VGGNet
(Figure: the VGG configuration table; column D is VGG16, column E is VGG19. All filters are 3x3.)
58
VGGNet
• More layers with smaller (3x3) filters work better
• More non-linearity, fewer parameters
59
VGG 19
60
GoogLeNet (2014)
• Paper:
http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
Inception
Module
61
Inception Module
• Best size?
◦ 3x3? 5x5?
62
Inception Module
(Figure: the previous layer feeds four parallel branches: 1x1 convolution, 3x3 convolution, 5x5 convolution, and 3x3 max-pooling; their outputs are joined by filter concatenation.)
63
Inception Module with Dimension Reduction
• Use 1x1 filters to reduce dimension
64
Inception Module with Dimension Reduction
(Figure: at each spatial position, a 1x1 convolution with weights of shape 1x1x256x128 maps an input of size 1x1x256 from the previous layer to an output of size 1x1x128, reducing the depth from 256 to 128.)
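A quick numpy sanity check of what the 1x1 convolution above does to shapes and parameter counts; the 28x28 spatial size is an assumed example, only the depth matters here.

```python
import numpy as np

h, w, c_in, c_out = 28, 28, 256, 128        # 28x28 spatial size is assumed
x = np.random.randn(h, w, c_in)             # input volume
kernel = np.random.randn(c_in, c_out)       # a 1x1 convolution is a per-position matrix multiply

y = x @ kernel                               # shape (28, 28, 128): depth reduced 256 -> 128
print(y.shape)
print("parameters:", 1 * 1 * c_in * c_out)   # 1x1x256x128 = 32768 weights
```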
65
ResNet (2015)
• Paper: https://arxiv.org/abs/1512.03385
• Residual Networks
• 152 layers
66
ResNet
• Residual learning: a building block
(Figure: the stacked layers in the block learn the residual function.)
67
Residual Learning with Dimension Reduction
• using 1x1 filters
68
Pretrained Model Download
• http://www.vlfeat.org/matconvnet/pretrained/
◦ AlexNet: http://www.vlfeat.org/matconvnet/models/imagenet-matconvnet-alex.mat
◦ VGG19: http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
◦ GoogLeNet: http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat
◦ ResNet: http://www.vlfeat.org/matconvnet/models/imagenet-resnet-152-dag.mat
69
Using Pretrained Model
• Lower layers: edge, blob, texture (more general)
• Higher layers: object parts (more specific)
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
70
Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is similar to the ImageNet data:
◦ Fix all CNN layers
◦ Train the FC layer
(Figure: left, the pretrained network of conv layers and an FC layer trained on ImageNet data; right, the same conv layers fixed, with only the FC layer retrained on your labeled data.)
71
Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is very different from the ImageNet data:
◦ Fix the lower CNN layers
◦ Train the higher CNN layers and the FC layer
(Figure: left, the pretrained network trained on ImageNet data; right, the lower conv layers fixed, with the higher conv layers and the FC layer retrained on your labeled data.)
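A minimal tf.keras sketch of the "similar data" recipe above (freeze the convolutional layers, train only a new classification head); the image size, number of classes, and dataset are placeholders.

```python
import tensorflow as tf

# Pretrained convolutional layers (ImageNet weights), without the original FC layers.
base = tf.keras.applications.VGG19(weights="imagenet", include_top=False,
                                   input_shape=(224, 224, 3))
base.trainable = False                      # fix all CNN layers

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation="relu"),    # new FC layer to train
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 classes is an assumption
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(your_images, your_labels, epochs=5)       # your labeled data
```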
72
Tensorflow Transfer Learning Example
• https://www.tensorflow.org/versions/r0.11/how_tos/style_guide.html
75
Visualizing CNN
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png
76
Visualizing CNN
(Figure: feeding a flower image through the CNN produces a filter response; feeding random noise through the same CNN also produces a filter response.)
Gradient Ascent
• Magnify the filter response
(Figure: random noise x is fed through the CNN to produce the filter response f; a larger response gives a higher score.)
score: F = Σ_{i,j} f_{i,j}
gradient: ∂F/∂x
78
Gradient Ascent
• Magnify the filter response
(Figure: the random noise x is updated so that the filter response f, and hence the score F, increases.)
update x:  x ← x + η ∂F/∂x
(η: learning rate;  ∂F/∂x: gradient)
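A toy numpy sketch of the gradient-ascent loop above. Instead of a real CNN filter it uses a simple linear stand-in response f = W·x (an assumption for illustration); with a real network the gradient ∂F/∂x would come from backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 16))        # stand-in "filter": response f = W @ x
x = rng.standard_normal(16)             # start from random noise
eta = 0.1                               # learning rate

for _ in range(100):
    f = W @ x                           # filter response
    F = f.sum()                         # score F = sum_{i,j} f_ij
    grad = W.sum(axis=0)                # dF/dx for this linear stand-in
    x = x + eta * grad                  # gradient ascent: magnify the response

print(F)
```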
79
Gradient Ascent
80
Different Layers of Visualization
CNN
81
Multiscale Image Generation
(Figure: visualize at a small scale, resize the image, visualize again, resize again, and visualize at the final scale.)
82
Multiscale Image Generation
83
Deep Dream
• https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
• Source code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
• Example images: http://download.tensorflow.org/example_images/flower_photos.tgz
84
Deep Dream
85
Deep Dream
86
Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN
87
Neural Art
• Paper: https://arxiv.org/abs/1508.06576
• Source code : https://github.com/ckmarkoh/neuralart_tensorflow
content + style → artwork
content: http://www.taipei-101.com.tw/upload/news/201502/2015021711505431705145.JPG
style: https://github.com/andersbll/neural_artistic_style/blob/master/images/starry_night.jpg?raw=true
88
The Mechanism of Painting
Artist Brain
90
Content Generation
(Figure: the content image provides neural stimulation to the artist's brain; the artist draws on the canvas to minimize the difference between the two stimulations.)
Content Generation
(Figure: the content image is passed through VGG19 to obtain filter responses of size Width*Height x Depth; the canvas is passed through the same network, and the colors of its pixels are updated to minimize the difference between the two sets of filter responses, giving the result.)
92
Content Generation
Input Layer l’s Filter
Input Layer l’s Filter l
Photo: Responses:
Canvas: Responses:
Depth (i)
Depth (i)
Width*Height (j) Width*Height (j)
93
Content Generation
• Backward Propagation
Layer l’s Filter l
VGG19 Responses:
Input
Canvas:
Update
Canvas
Learning Rate
94
Content Generation
95
Content Generation
VGG19
96
Style Generation
(Figure: the artwork is passed through VGG19 to obtain filter responses of size Width*Height x Depth, which are position-dependent; from them the Depth x Depth Gram matrix G is computed, which is position-independent.)
97
Style Generation
Layer l’s Filter Responses
Width*Height Gram Matrix
1. .5 1. .5 .25 1.
Depth
.5 .5 .25 .5
Depth
.5 .25 .25
1. 1. .5 1.
k1 k2
Depth
k2
k1
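A short numpy sketch of the Gram-matrix computation implied by the figure (the response values here are made up): filter responses are flattened over positions, and each Gram entry is the inner product of two filters' responses, so all position information is discarded.

```python
import numpy as np

# Layer l filter responses: Depth x (Width*Height). Values are illustrative only.
F = np.array([[1.0, 0.5, 0.5],
              [0.5, 0.5, 1.0],
              [1.0, 0.0, 0.5]])

G = F @ F.T       # Gram matrix, Depth x Depth: G[k1, k2] = sum_j F[k1, j] * F[k2, j]
print(G)
```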
98
Style Generation
Input Layer l’s Input Layer l’s
Artwork: Gram Matrix Canvas: Gram Matrix
Layer l’s
Filter Responses
99
Style Generation
(Figure: the style image and the canvas are each passed through VGG19; their filter responses give Gram matrices G, and the canvas is updated to minimize the difference between the two Gram matrices.)
101
Style Generation
VGG19
Gram Matrix
103
Artwork Generation
(Figure: the content is matched at VGG19 layer Conv4_2, while the style is matched at layers Conv1_1, Conv2_1, Conv3_1, Conv4_1, and Conv5_1.)
104
Artwork Generation
105
Content vs. Style
(Figure: results for different content/style weight ratios: 0.15, 0.05, 0.02, 0.007.)
106
Neural Doodle
• Paper: https://arxiv.org/abs/1603.01768
• Source code: https://github.com/alexjc/neural-doodle
(Figure: content image, style image, semantic maps, and the result.)
107
Neural Doodle
• Image analogy
108
Neural Doodle
• Image analogy
(Disturbing link, open with caution!)
https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/trump-image-analogy.jpg
109
Real-time Texture Synthesis
• Paper: https://arxiv.org/pdf/1604.04382v1.pdf
◦ GAN: https://arxiv.org/pdf/1406.2661v1.pdf
◦ VAE: https://arxiv.org/pdf/1312.6114v10.pdf
110
Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN
111
A Convolutional Neural Network for Modelling Sentences
• Paper: https://arxiv.org/abs/1404.2188
• Source code:
https://github.com/FredericGodin/DynamicCNN
112
Drawbacks of Recursive Neural Networks (RvNN)
• Need a human-labeled syntax tree during training
(Figure: to train on the sentence "This is a dog", the word vectors are combined by RvNN units following the human-labeled syntax tree.)
113
Drawbacks of Recursive Neural Networks (RvNN)
• Ambiguity in natural language
http://3rd.mafengwo.cn/travels/info_weibo.php?id=2861280
http://www.appledaily.com.tw/realtimenews/article/new/20151006/705309/
114
Element-wise 1D operations on word vectors
• 1D Convolution or 1D Pooling
(Figure: each word of "This is a" is represented by its word vector; a 1D convolution or 1D pooling operation is applied element-wise across the word vectors.)
115
From RvNN to CNN
• RvNN: the same RvNN unit is applied recursively at every node
• CNN: different convolutional layers (e.g. conv3) are stacked
116
CNN with Max-Pooling Layers
• Similar to syntax tree
• But human-labeled syntax tree is not needed
117
Sentiment Analysis by CNN
• Use softmax layer to classify the sentiments
(Figure: a softmax layer on top of conv2 classifies one sentence as positive and another as negative.)
118
Sentiment Analysis by CNN
• Build the “correct syntax tree” by training
(Figure: when the softmax output ("negative") is wrong, the error is backpropagated through the softmax, conv2, and pool1 layers.)
119
Sentiment Analysis by CNN
• Build the “correct syntax tree” by training
(Figure: after the weights of conv2 are updated, the prediction changes from negative to positive; the pool1 layers select different nodes.)
120
Multiple Filters
• Richer features than RvNN
(Figure: multiple filters applied to the word vectors of "This is a".)
121
Sentence can’t be easily resized
• Image can be easily resized • Sentence can’t be easily
resized
全台灣最高樓在台北市
resize
resize
全台灣最高的高樓在台北市
全台灣最高樓在台北
台灣最高樓在台北
122
Various Input Size
• Convolutional layers and pooling layers
◦ can handle inputs of various sizes
123
Various Input Size
• Fully-connected layer and softmax layer
◦ need fixed-size input
(Figure: networks ending in an fc layer and a softmax layer.)
124
k-max Pooling
• choose the k-max values
• preserve the order of input values
• variable-size input, fixed-size output
Example (3-max pooling):
[12, 5, 21, 15, 7, 4, 9]  →  [12, 21, 15]
[13, 4, 1, 7, 8]          →  [13, 7, 8]
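A small numpy sketch of k-max pooling, using the example values above: keep the k largest values but in their original order, so inputs of different lengths give a fixed-size output.

```python
import numpy as np

def k_max_pooling(x, k):
    # Indices of the k largest values, sorted so the original order is preserved.
    idx = np.sort(np.argsort(x)[-k:])
    return x[idx]

print(k_max_pooling(np.array([12, 5, 21, 15, 7, 4, 9]), 3))   # [12 21 15]
print(k_max_pooling(np.array([13, 4, 1, 7, 8]), 3))           # [13  7  8]
```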
125
Wide Convolution
• Ensures that all weights reach the entire sentence
126
Dynamic k-max Pooling
k_l = max( k_top, ⌈ ((L − l) / L) · s ⌉ )
l : index of the current layer
k_l : k of the current layer
k_top : k of the top layer
L : total number of layers
s : length of the input sentence
(Figure: a stack of wide convolution & k-max pooling layers; the lower layers use k_l, the top layer uses k_top.)
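A one-line sketch of the dynamic k-max formula above, reproducing the three examples on the following slides (s = 10, 14, 8 with k_top = 3, L = 2, l = 1).

```python
from math import ceil

def dynamic_k(l, L, s, k_top):
    # k_l = max(k_top, ceil((L - l) / L * s))
    return max(k_top, ceil((L - l) / L * s))

for s in (10, 14, 8):
    print(dynamic_k(l=1, L=2, s=s, k_top=3))   # -> 5, 7, 4
```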
127
Dynamic k-max Pooling
k_l = max( k_top, ⌈ ((L − l) / L) · s ⌉ )
k_top = 3, L = 2, s = 10
k_1 = max(3, ⌈((2 − 1)/2) × 10⌉) = 5
(Figure: two stages of conv & pooling.)
128
Dynamic k-max Pooling
k_l = max( k_top, ⌈ ((L − l) / L) · s ⌉ )
k_top = 3, L = 2, s = 14
k_1 = max(3, ⌈((2 − 1)/2) × 14⌉) = 7
(Figure: two stages of conv & pooling.)
129
Dynamic k-max Pooling
k_l = max( k_top, ⌈ ((L − l) / L) · s ⌉ )
k_top = 3, L = 2, s = 8
k_1 = max(3, ⌈((2 − 1)/2) × 8⌉) = 4
(Figure: two stages of conv & pooling.)
130
Dynamic k-max Pooling
131
Convolutional Neural Networks for Sentence
Classification
• Paper: http://www.aclweb.org/anthology/D14-1181
• Source code:
https://github.com/yoonkim/CNN_sentence
132
Static & Non-Static Channel
• Pretrained by word2vec
• Static: fix the values during training
• Non-Static: update the values during training
133
About the Lecturer
Mark Chang
HTC Research & Healthcare
Deep Learning Algorithms Research Engineer
134