K-Max Pooling Operation


Slide credit: Mark Chang


Convolutional Neural Networks
• Covering this topic properly takes a full course
◦ http://cs231n.stanford.edu/syllabus.html

• However, we only have one lecture

2
Outline
• CNN(Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

3
Image Recognition

http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf

5
Image Recognition

6
Local Connectivity
Neurons connect to a small region

7
Parameter Sharing
• The same feature in different positions

Neurons share the same weights
8
Parameter Sharing
• Different features in the same position

Neurons have different weights
9
Convolutional Layers

[Figure: an input volume (width x height x depth) mapped to an output volume by shared weights]

10
Convolutional Layers

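Input depth = 1 (a1, a2, a3), output depth = 2 (channels b and c). Each output channel reuses its own weights at every position:

$b_1 = w_{b1} a_1 + w_{b2} a_2$
$b_2 = w_{b1} a_2 + w_{b2} a_3$
$c_1 = w_{c1} a_1 + w_{c2} a_2$
$c_2 = w_{c1} a_2 + w_{c2} a_3$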
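The shared-weight equations above can be sketched in a few lines of Python. This is only an illustration: the function name `conv1d_layer` and the numeric values are made up, not taken from the slides.

```python
def conv1d_layer(inputs, filters):
    """Slide each filter over `inputs`; every position reuses the same weights."""
    outputs = []
    for weights in filters:  # one output channel per filter
        width = len(weights)
        channel = [
            sum(w * x for w, x in zip(weights, inputs[i:i + width]))
            for i in range(len(inputs) - width + 1)
        ]
        outputs.append(channel)
    return outputs


a = [1.0, 2.0, 3.0]   # a1, a2, a3 (illustrative values)
w_b = [0.5, -1.0]     # wb1, wb2
w_c = [2.0, 1.0]      # wc1, wc2
b, c = conv1d_layer(a, [w_b, w_c])  # b = [b1, b2], c = [c1, c2]
```

Here `b[0]` is wb1*a1 + wb2*a2 and `b[1]` is wb1*a2 + wb2*a3, matching the equations for b1 and b2.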

11
Convolutional Layers
Input depth = 2 (channels a and b), output depth = 2 (channels c and d). Weights of output channel c:

$c_1 = a_1 w_{c1} + b_1 w_{c2} + a_2 w_{c3} + b_2 w_{c4}$
$c_2 = a_2 w_{c1} + b_2 w_{c2} + a_3 w_{c3} + b_3 w_{c4}$

12
Convolutional Layers
Input depth = 2 (channels a and b), output depth = 2 (channels c and d). Output channel d has its own weights:

$c_1 = a_1 w_{c1} + b_1 w_{c2} + a_2 w_{c3} + b_2 w_{c4}$
$c_2 = a_2 w_{c1} + b_2 w_{c2} + a_3 w_{c3} + b_3 w_{c4}$
$d_1 = a_1 w_{d1} + b_1 w_{d2} + a_2 w_{d3} + b_2 w_{d4}$
$d_2 = a_2 w_{d1} + b_2 w_{d2} + a_3 w_{d3} + b_3 w_{d4}$

13
Convolutional Layers
A B C

A B C D

14
Hyper-parameters of CNN
• Stride: examples with stride = 1 and stride = 2
• Padding: examples with padding = 0 and padding = 1 (zeros added around the border)

15
Example
• Input volume: 7x7x3 (padding = 1)
• Filter: 3x3x3
• Stride = 2
• Output volume: 3x3x2
http://cs231n.github.io/convolutional-networks/
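The output size in this example follows from the usual formula (W - F + 2P) / S + 1. The sketch below assumes that the 7x7 input in the slide already includes the padding border, so the unpadded width is 5; the function name is hypothetical.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Number of filter positions along one spatial axis."""
    span = input_size + 2 * padding - filter_size
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the padded input evenly")
    return span // stride + 1


# Unpadded width 5, filter 3, stride 2, padding 1 -> output width 3
width = conv_output_size(5, 3, 2, 1)
```

The output depth (2 in the slide) is the number of filters and does not come from this formula.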

16
Convolutional Layers

http://cs231n.github.io/convolutional-networks/

19
Relationship with Convolution
$y[n] = \sum_k x[k]\, w[n-k]$

[Figure: x[k] and the flipped kernel w[0-k] plotted against k; w[n] and y[n] plotted against n]

$y[0] = x[-2]w[2] + x[-1]w[1] + x[0]w[0]$

20
Relationship with Convolution
$y[n] = \sum_k x[k]\, w[n-k]$

[Figure: x[k] and the flipped, shifted kernel w[1-k] plotted against k; w[n] and y[n] plotted against n]

$y[1] = x[-1]w[2] + x[0]w[1] + x[1]w[0]$

21
Relationship with Convolution
$y[n] = \sum_k x[k]\, w[n-k]$

[Figure: x[k] and the flipped, shifted kernel w[2-k] plotted against k; w[n] and y[n] plotted against n]

$y[2] = x[0]w[2] + x[1]w[1] + x[2]w[0]$

22
Relationship with Convolution
$y[n] = \sum_k x[k]\, w[n-k]$

[Figure: x[k] and the flipped, shifted kernel w[4-k] plotted against k; w[n] and y[n] plotted against n]

$y[4] = x[2]w[2] + x[3]w[1] + x[4]w[0]$
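A minimal sketch of the sum y[n] = Σ x[k] w[n-k], assuming x and w are finite lists indexed from 0 and zero everywhere else. The function name and the sample values are illustrative.

```python
def convolve(x, w):
    """Full discrete convolution: y[n] = sum over k of x[k] * w[n - k]."""
    y = [0] * (len(x) + len(w) - 1)
    for k, xk in enumerate(x):
        for m, wm in enumerate(w):
            y[k + m] += xk * wm  # the term x[k] * w[m] lands at n = k + m
    return y


x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
y = convolve(x, w)
# y[2] = x[0]*w[2] + x[1]*w[1] + x[2]*w[0]
```

A convolutional layer computes the same kind of sliding weighted sum, usually without flipping the kernel.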

23
Nonlinearity
• Rectified Linear (ReLU)


$n_{out} = \begin{cases} n_{in} & \text{if } n_{in} > 0 \\ 0 & \text{otherwise} \end{cases}$

Example: ReLU applied element-wise to the vector (1, 4, -3, 1) gives (1, 4, 0, 1).

24
Why ReLU?
• Easy to train
• Avoids the vanishing gradient problem

Sigmoid: saturated, gradient ≈ 0
ReLU: not saturated

25
Why ReLU?
• Biological reason
[Figure: neuron response v over time t under strong stimulation and under weak stimulation, next to the ReLU curve with its strong-stimulation and weak-stimulation regions marked]

26
Pooling Layer
Input (depth = 1):
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0

Pooling uses 2x2 windows with no overlap and has no weights.

Maximum pooling:
7 8
5 3
Max(1,3,5,7) = 7, Max(0,0,5,5) = 5

Average pooling:
4 5
2.5 1.5
Avg(1,3,5,7) = 4
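The pooling example can be reproduced with a small sketch. It assumes square non-overlapping windows and a grid whose sides are multiples of the window size; `pool2d` and `mean` are illustrative names.

```python
def pool2d(grid, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    return [
        [
            reduce_fn([grid[r + dr][c + dc]
                       for dr in range(size) for dc in range(size)])
            for c in range(0, len(grid[0]), size)
        ]
        for r in range(0, len(grid), size)
    ]


def mean(values):
    return sum(values) / len(values)


grid = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [0, 0, 3, 3],
    [5, 5, 0, 0],
]
max_pooled = pool2d(grid, 2, max)
avg_pooled = pool2d(grid, 2, mean)
```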

27
Why “Deep” Learning?

28
Visual Perception of Human

http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg

29
Visual Perception of Computer
Input Layer → Convolutional Layer → Pooling Layer → Convolutional Layer → Pooling Layer
[Figure: receptive fields of neurons in successive layers]
30
Visual Perception of Computer
[Figure: input image → input layer → convolutional layer with receptive fields → filter responses → max-pooling layer with width = 3, height = 3 → filter responses]


31
Fully-Connected Layer
• Fully-Connected Layers : Global feature extraction
• Softmax Layer: Classifier
Input Image → Input Layer → Convolutional Layer → Pooling Layer → Convolutional Layer → Pooling Layer → Fully-Connected Layer → Softmax Layer → Class Label (for example 5 or 7)

32
Visual Perception of Computer
• Alexnet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf

http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

33
Training
• Forward Propagation
Neuron n1 feeds neuron n2 through weight w21:

$n_{2(in)} = w_{21}\, n_{1(out)}$
$n_{2(out)} = g(n_{2(in)})$, where g is the activation function

34
Training
• Update weights
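Neuron n1 feeds neuron n2 through weight w21; cost function: J

$\frac{\partial J}{\partial w_{21}} = \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial w_{21}}$

$w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial w_{21}}$

$\Rightarrow w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial w_{21}}$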
35
Training
• Update weights
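Neuron n1 feeds neuron n2 through weight w21; cost function: J

$n_{2(out)} = g(n_{2(in)}), \quad n_{2(in)} = w_{21}\, n_{1(out)}$

$\Rightarrow \frac{\partial n_{2(out)}}{\partial n_{2(in)}} = g'(n_{2(in)}), \quad \frac{\partial n_{2(in)}}{\partial w_{21}} = n_{1(out)}$

$w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial w_{21}}$

$\Rightarrow w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial n_{2(out)}}\, g'(n_{2(in)})\, n_{1(out)}$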
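A minimal sketch of one such update for a single weight. The slides leave g and J unspecified, so this assumes a sigmoid activation and a squared-error cost J = 0.5 (n2(out) - target)^2; the function names and numbers are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def cost(w21, n1_out, target):
    """Assumed cost: J = 0.5 * (n2_out - target)^2."""
    return 0.5 * (sigmoid(w21 * n1_out) - target) ** 2


def update_weight(w21, n1_out, target, eta):
    n2_in = w21 * n1_out                 # forward propagation
    n2_out = sigmoid(n2_in)
    dJ_dout = n2_out - target            # dJ / dn2(out) for the assumed cost
    dout_din = n2_out * (1.0 - n2_out)   # g'(n2(in)) for the sigmoid
    din_dw = n1_out                      # dn2(in) / dw21
    return w21 - eta * dJ_dout * dout_din * din_dw


w_old = 0.5
w_new = update_weight(w_old, n1_out=1.0, target=1.0, eta=0.1)
```

With these values the output is below the target, so the update raises the weight and lowers the cost.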
36
Training
• Propagate to the previous layer
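Neuron n1 feeds neuron n2; cost function: J

$\frac{\partial J}{\partial n_{1(in)}} = \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial n_{1(out)}} \frac{\partial n_{1(out)}}{\partial n_{1(in)}}$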

37
Training Convolutional Layers
• Example: a convolutional layer with inputs a1, a2, a3 and outputs b1, b2. Weight wb1 connects a1 → b1 and a2 → b2; weight wb2 connects a2 → b1 and a3 → b2.

To simplify the notation in the following slides, b1 means b1(in), a1 means a1(out), and so on.

38
Training Convolutional Layers
• Forward propagation
Convolutional layer with inputs a1, a2, a3:

$b_1 = w_{b1} a_1 + w_{b2} a_2$
$b_2 = w_{b1} a_2 + w_{b2} a_3$

39
Training Convolutional Layers
• Update weights
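Cost function: J. The weight wb1 is used by both b1 and b2, so both paths contribute:

$w_{b1} \leftarrow w_{b1} - \eta \left( \frac{\partial J}{\partial b_1} \frac{\partial b_1}{\partial w_{b1}} + \frac{\partial J}{\partial b_2} \frac{\partial b_2}{\partial w_{b1}} \right)$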

40
Training Convolutional Layers
• Update weights
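$b_1 = w_{b1} a_1 + w_{b2} a_2 \Rightarrow \frac{\partial b_1}{\partial w_{b1}} = a_1$
$b_2 = w_{b1} a_2 + w_{b2} a_3 \Rightarrow \frac{\partial b_2}{\partial w_{b1}} = a_2$

$w_{b1} \leftarrow w_{b1} - \eta \left( \frac{\partial J}{\partial b_1} a_1 + \frac{\partial J}{\partial b_2} a_2 \right)$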

41
Training Convolutional Layers
• Update weights
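Cost function: J. The weight wb2 is used by both b1 and b2, so both paths contribute:

$w_{b2} \leftarrow w_{b2} - \eta \left( \frac{\partial J}{\partial b_1} \frac{\partial b_1}{\partial w_{b2}} + \frac{\partial J}{\partial b_2} \frac{\partial b_2}{\partial w_{b2}} \right)$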

42
Training Convolutional Layers
• Update weights
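$b_1 = w_{b1} a_1 + w_{b2} a_2 \Rightarrow \frac{\partial b_1}{\partial w_{b2}} = a_2$
$b_2 = w_{b1} a_2 + w_{b2} a_3 \Rightarrow \frac{\partial b_2}{\partial w_{b2}} = a_3$

$w_{b2} \leftarrow w_{b2} - \eta \left( \frac{\partial J}{\partial b_1} a_2 + \frac{\partial J}{\partial b_2} a_3 \right)$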
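The two shared-weight updates can be sketched together. The gradients ∂J/∂b1 and ∂J/∂b2 are assumed to arrive from the layers above; the function names and values are illustrative.

```python
def conv_weight_grads(a, dJ_db):
    """Gradients of J with respect to the shared weights wb1 and wb2."""
    a1, a2, a3 = a
    g1, g2 = dJ_db                   # dJ/db1, dJ/db2
    dJ_dwb1 = g1 * a1 + g2 * a2      # wb1 multiplies a1 in b1 and a2 in b2
    dJ_dwb2 = g1 * a2 + g2 * a3      # wb2 multiplies a2 in b1 and a3 in b2
    return dJ_dwb1, dJ_dwb2


def sgd_step(weights, grads, eta):
    return tuple(w - eta * g for w, g in zip(weights, grads))


grads = conv_weight_grads((1.0, 2.0, 3.0), (0.5, -1.0))
new_weights = sgd_step((0.0, 0.0), grads, eta=1.0)
```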

43
Training Convolutional Layers
• Propagate to the previous layer
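Cost function: J

$\frac{\partial J}{\partial a_1} = \frac{\partial J}{\partial b_1} \frac{\partial b_1}{\partial a_1}$

$\frac{\partial J}{\partial a_2} = \frac{\partial J}{\partial b_1} \frac{\partial b_1}{\partial a_2} + \frac{\partial J}{\partial b_2} \frac{\partial b_2}{\partial a_2}$

$\frac{\partial J}{\partial a_3} = \frac{\partial J}{\partial b_2} \frac{\partial b_2}{\partial a_3}$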

44
Training Convolutional Layers
• Propagate to the previous layer
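$b_1 = w_{b1} a_1 + w_{b2} a_2 \Rightarrow \frac{\partial b_1}{\partial a_1} = w_{b1}, \quad \frac{\partial b_1}{\partial a_2} = w_{b2}$
$b_2 = w_{b1} a_2 + w_{b2} a_3 \Rightarrow \frac{\partial b_2}{\partial a_2} = w_{b1}, \quad \frac{\partial b_2}{\partial a_3} = w_{b2}$

$\frac{\partial J}{\partial a_1} = \frac{\partial J}{\partial b_1} w_{b1}$

$\frac{\partial J}{\partial a_2} = \frac{\partial J}{\partial b_1} w_{b2} + \frac{\partial J}{\partial b_2} w_{b1}$

$\frac{\partial J}{\partial a_3} = \frac{\partial J}{\partial b_2} w_{b2}$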
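A short sketch of propagating the gradient to the layer's inputs, using the same two-output example. The function name and values are illustrative.

```python
def conv_input_grads(w, dJ_db):
    """Gradients of J with respect to the inputs a1, a2, a3."""
    wb1, wb2 = w
    g1, g2 = dJ_db                 # dJ/db1, dJ/db2
    dJ_da1 = g1 * wb1              # a1 only feeds b1
    dJ_da2 = g1 * wb2 + g2 * wb1   # a2 feeds b1 (via wb2) and b2 (via wb1)
    dJ_da3 = g2 * wb2              # a3 only feeds b2
    return dJ_da1, dJ_da2, dJ_da3


input_grads = conv_input_grads((2.0, 3.0), (1.0, 10.0))
```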

45
Max-Pooling Layers during Training
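• Pooling layers have no weights
• No need to update weights

Max-pooling over inputs a1, a2, a3:

$b_1 = \max(a_1, a_2)$
$b_2 = \max(a_2, a_3) = \begin{cases} a_2 & \text{if } a_2 \ge a_3 \\ a_3 & \text{otherwise} \end{cases}$

$\frac{\partial b_2}{\partial a_2} = \begin{cases} 1 & \text{if } a_2 \ge a_3 \\ 0 & \text{otherwise} \end{cases}$

In the example that follows, a1 > a2 and a2 > a3.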

46
Max-Pooling Layers during Training
• Propagate to the previous layer
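Cost function: J. With a1 > a2 and a2 > a3:

$\frac{\partial b_1}{\partial a_1} = 1, \quad \frac{\partial b_1}{\partial a_2} = 0, \quad \frac{\partial b_2}{\partial a_2} = 1, \quad \frac{\partial b_2}{\partial a_3} = 0$

$\frac{\partial J}{\partial a_1} = \frac{\partial J}{\partial b_1}, \quad \frac{\partial J}{\partial a_2} = \frac{\partial J}{\partial b_2}, \quad \frac{\partial J}{\partial a_3} = 0$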

47
Max-Pooling Layers during Training
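• What if a1 = a2?
◦ Choose the node with the smaller index

Cost function: J. With a1 = a2 = a3, b1 takes a1 and b2 takes a2:

$\frac{\partial J}{\partial a_1} = \frac{\partial J}{\partial b_1}, \quad \frac{\partial J}{\partial a_2} = \frac{\partial J}{\partial b_2}, \quad \frac{\partial J}{\partial a_3} = 0$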
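The max-pooling backward pass, including the tie-breaking rule, can be sketched as follows. It assumes windows of size 2 that overlap by one input, as in the slides; the function name is illustrative.

```python
def max_pool_backward(a, dJ_db):
    """Backward pass for b[i] = max(a[i], a[i + 1]).

    The gradient of each output goes entirely to the winning input.
    On a tie the input with the smaller index wins.
    """
    dJ_da = [0.0] * len(a)
    for i, grad in enumerate(dJ_db):
        winner = i if a[i] >= a[i + 1] else i + 1
        dJ_da[winner] += grad
    return dJ_da


decreasing = max_pool_backward([3.0, 2.0, 1.0], [0.5, 2.0])  # a1 > a2 > a3
tied = max_pool_backward([1.0, 1.0, 1.0], [0.5, 2.0])        # a1 = a2 = a3
peak = max_pool_backward([1.0, 5.0, 2.0], [0.5, 2.0])        # a2 wins both windows
```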

48
Avg-Pooling Layers during Training
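• Pooling layers have no weights
• No need to update weights

Avg-pooling over inputs a1, a2, a3:

$b_1 = \frac{1}{2}(a_1 + a_2)$
$b_2 = \frac{1}{2}(a_2 + a_3)$

$\frac{\partial b_2}{\partial a_2} = \frac{1}{2}, \quad \frac{\partial b_2}{\partial a_3} = \frac{1}{2}$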

49
Avg-Pooling Layers during Training
• Propagate to the previous layer
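Cost function: J

$\frac{\partial b_1}{\partial a_1} = \frac{\partial b_1}{\partial a_2} = \frac{1}{2}, \quad \frac{\partial b_2}{\partial a_2} = \frac{\partial b_2}{\partial a_3} = \frac{1}{2}$

$\frac{\partial J}{\partial a_1} = \frac{1}{2} \frac{\partial J}{\partial b_1}$

$\frac{\partial J}{\partial a_2} = \frac{1}{2} \left( \frac{\partial J}{\partial b_1} + \frac{\partial J}{\partial b_2} \right)$

$\frac{\partial J}{\partial a_3} = \frac{1}{2} \frac{\partial J}{\partial b_2}$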
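For comparison with max-pooling, a sketch of the avg-pooling backward pass under the same assumption (windows of size 2 that overlap by one input); the function name is illustrative.

```python
def avg_pool_backward(n_inputs, dJ_db):
    """Backward pass for b[i] = (a[i] + a[i + 1]) / 2.

    Each output splits its gradient equally between its two inputs.
    """
    dJ_da = [0.0] * n_inputs
    for i, grad in enumerate(dJ_db):
        dJ_da[i] += 0.5 * grad
        dJ_da[i + 1] += 0.5 * grad
    return dJ_da


avg_grads = avg_pool_backward(3, [2.0, 4.0])
```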

50
ReLU during Training


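$n_{out} = \begin{cases} n_{in} & \text{if } n_{in} > 0 \\ 0 & \text{otherwise} \end{cases}$

$\frac{\partial n_{out}}{\partial n_{in}} = \begin{cases} 1 & \text{if } n_{in} > 0 \\ 0 & \text{otherwise} \end{cases}$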
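ReLU and its derivative are one-liners. The sketch below takes the derivative at exactly 0 to be 0, in line with the "otherwise" branch; the function names and inputs are illustrative.

```python
def relu(n_in):
    return n_in if n_in > 0 else 0.0


def relu_grad(n_in):
    """Derivative of ReLU; 0 is used at n_in == 0."""
    return 1.0 if n_in > 0 else 0.0


outputs = [relu(v) for v in [1.0, 4.0, -3.0, 1.0]]
grads = [relu_grad(v) for v in [1.0, 4.0, -3.0, 1.0]]
```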

51
Training CNN

52
Outline
• CNN(Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

53
LeNet
◦ Paper:
http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf

Yann LeCun http://yann.lecun.com/exdb/lenet/

54
ImageNet Challenge
• ImageNet Large Scale Visual Recognition Challenge
◦ http://image-net.org/challenges/LSVRC/

• Dataset :
◦ 1000 categories
◦ Training: 1,200,000
◦ Validation: 50,000
◦ Testing: 100,000

http://vision.stanford.edu/Datasets/collage_s.png

55
ImageNet Challenge

http://www.qingpingshan.com/uploads/allimg/160818/1J22QI5-0.png

56
AlexNet (2012)
• Paper:
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• The resurgence of Deep Learning

Alex Krizhevsky Geoffrey Hinton


57
VGGNet (2014)
• Paper: https://arxiv.org/abs/1409.1556

D: VGG16
E: VGG19
All filters are 3x3

58
VGGNet
• More layers with smaller filters (3x3) work better
• More non-linearity, fewer parameters

One 5x5 filter:
• Parameters: 5x5 = 25
• Non-linearities: 1

Two 3x3 filters:
• Parameters: 3x3x2 = 18
• Non-linearities: 2
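The comparison can be checked with a small sketch. It counts weights for a single input and output channel without biases, and assumes stride 1, in which case two stacked 3x3 filters see the same 5x5 region as one 5x5 filter. The function names are illustrative.

```python
def stacked_conv_params(kernel, layers):
    """Weights in `layers` stacked kernel x kernel filters (one channel, no bias)."""
    return kernel * kernel * layers


def stacked_receptive_field(kernel, layers):
    """Input width seen by one output of `layers` stacked stride-1 filters."""
    return 1 + layers * (kernel - 1)


one_5x5 = (stacked_conv_params(5, 1), stacked_receptive_field(5, 1))
two_3x3 = (stacked_conv_params(3, 2), stacked_receptive_field(3, 2))
```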

59
VGG 19

conv1_1, conv1_2: 3x3 conv, depth=64, then maxpool
conv2_1, conv2_2: 3x3 conv, depth=128, then maxpool
conv3_1, conv3_2, conv3_3, conv3_4: 3x3 conv, depth=256, then maxpool
conv4_1, conv4_2, conv4_3, conv4_4: 3x3 conv, depth=512, then maxpool
conv5_1, conv5_2, conv5_3, conv5_4: 3x3 conv, depth=512, then maxpool
FC1, FC2: size=4096
softmax: size=1000

60
GoogLeNet (2014)
• Paper:
http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf

22 layers deep network

Inception
Module

61
Inception Module
• Best size?
◦ 3x3? 5x5?

• Use them all, and combine

62
Inception Module

Previous layer → {1x1 convolution, 3x3 convolution, 5x5 convolution, 3x3 max-pooling} → Filter concatenate

63
Inception Module with Dimension Reduction
• Use 1x1 filters to reduce dimension

64
Inception Module with Dimension Reduction
Previous layer (input size 1x1x256) → 1x1 convolution (1x1x256x128) → reduced dimension (output size 1x1x128)
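At a single spatial position, a 1x1 convolution is a matrix-vector product across the depth dimension. The sketch below uses a tiny 4-to-2 reduction for readability; the function names and weights are illustrative, and biases are left out.

```python
def conv1x1(pixel, weights):
    """pixel: C_in values at one position; weights: C_out rows of C_in weights."""
    return [sum(w * x for w, x in zip(row, pixel)) for row in weights]


def conv1x1_param_count(c_in, c_out):
    return c_in * c_out  # 1 x 1 x c_in x c_out, no bias


reduced = conv1x1(
    [1.0, 2.0, 3.0, 4.0],
    [[1.0, 0.0, 0.0, 0.0],
     [0.25, 0.25, 0.25, 0.25]],
)
params_256_to_128 = conv1x1_param_count(256, 128)
```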
65
ResNet (2015)
• Paper: https://arxiv.org/abs/1512.03385
• Residual Networks
• 152 layers

66
ResNet
• Residual learning: a building block

Residual
function
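A minimal sketch of the building block, assuming the formulation y = F(x) + x from the ResNet paper, where F is the residual function. The non-linearity after the addition is left out, and the names are illustrative.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x, where residual_fn computes the residual function F."""
    fx = residual_fn(x)
    return [xi + fi for xi, fi in zip(x, fx)]


# If F outputs zeros, the block reduces to the identity mapping.
identity_out = residual_block([1.0, 2.0], lambda v: [0.0 for _ in v])
shifted_out = residual_block([1.0, 2.0], lambda v: [0.5 * vi for vi in v])
```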

67
Residual Learning with Dimension Reduction
• using 1x1 filters

68
Pretrained Model Download
• http://www.vlfeat.org/matconvnet/pretrained/
◦ Alexnet:
◦ http://www.vlfeat.org/matconvnet/models/imagenet-matconvnet-alex.mat
◦ VGG19:
◦ http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
◦ GoogLeNet:
◦ http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat
◦ ResNet:
◦ http://www.vlfeat.org/matconvnet/models/imagenet-resnet-152-dag.mat

69
Using Pretrained Model
• Lower layers: edges, blobs, textures (more general)
• Higher layers: object parts (more specific)

http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

70
Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is similar to the ImageNet data
◦ Fix all CNN layers
◦ Train FC layer

[Figure: left, conv layers and FC layer trained on ImageNet labeled data; right, the same conv layers kept fixed with a new FC layer trained on your labeled data]

71
Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is very different from the ImageNet data
◦ Fix lower CNN layers
◦ Train higher CNN and FC layers

[Figure: left, conv layers and FC layer trained on ImageNet labeled data; right, the lower conv layers kept fixed while the higher conv layers and FC layer are trained on your labeled data]

72
Tensorflow Transfer Learning Example
• https://www.tensorflow.org/versions/r0.11/how_tos/style_guide.html
• Dataset: http://download.tensorflow.org/example_images/flower_photos.tgz
◦ daisy: 634 photos
◦ dandelion: 899 photos
◦ roses: 642 photos
◦ tulips: 800 photos
◦ sunflowers: 700 photos

Tensorflow Transfer Learning Example
[Figure: network with the earlier layers marked "Fix these layers" and the last layer marked "Train this layer"]

Outline
• CNN(Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

75
Visualizing CNN

http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

76
Visualizing CNN
flower → CNN → filter response
random noise → CNN → filter response

77
Gradient Ascent
• Magnify the filter response
random noise: x → CNN → filter response: f (moving from a lower score to a higher score)

score: $F = \sum_{i,j} f_{i,j}$
gradient: $\frac{\partial F}{\partial x}$
78
Gradient Ascent
• Magnify the filter response
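random noise: x → CNN → filter response: f (moving from a lower score to a higher score)

gradient: $\frac{\partial F}{\partial x}$
update x: $x \leftarrow x + \eta \frac{\partial F}{\partial x}$, where η is the learning rate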
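A toy sketch of the update x ← x + η ∂F/∂x. In place of a real CNN it assumes a linear "filter response" f_j = Σ_i W[j][i] x[i], so the gradient of the score F = Σ_j f_j can be written by hand; all names and values are illustrative.

```python
def score(x, filters):
    """F = sum of all filter responses, with f_j = sum_i filters[j][i] * x[i]."""
    return sum(sum(w * xi for w, xi in zip(row, x)) for row in filters)


def score_gradient(filters, n_inputs):
    """dF/dx[i] = sum_j filters[j][i]; constant because the toy response is linear."""
    return [sum(row[i] for row in filters) for i in range(n_inputs)]


def gradient_ascent(x, filters, eta, steps):
    grad = score_gradient(filters, len(x))
    for _ in range(steps):
        x = [xi + eta * g for xi, g in zip(x, grad)]  # x <- x + eta * dF/dx
    return x


filters = [[1.0, -1.0], [0.5, 2.0]]
x0 = [0.0, 0.0]
x1 = gradient_ascent(x0, filters, eta=0.1, steps=10)
```

For a real network the gradient would come from backward propagation through the CNN rather than from a closed form.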
79
Gradient Ascent

80
Different Layers of Visualization
CNN

81
Multiscale Image Generation

visualize → resize → visualize → resize → visualize

82
Multiscale Image Generation

83
Deep Dream
• https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
• Source code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb

http://download.tensorflow.org/example_images/flower_photos.tgz

84
Deep Dream

85
Deep Dream

86
Outline
• CNN(Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

87
Neural Art
• Paper: https://arxiv.org/abs/1508.06576
• Source code : https://github.com/ckmarkoh/neuralart_tensorflow
[Figure: content image + style image → artwork]

content: http://www.taipei-101.com.tw/upload/news/201502/2015021711505431705145.JPG
style: https://github.com/andersbll/neural_artistic_style/blob/master/images/starry_night.jpg?raw=true

88
The Mechanism of Painting
Artist: Scene + Style → Brain → Artwork
Computer: Scene + Style → Neural Networks → Artwork


89
Misconception

90
Content Generation
Content → Artist Brain → Neural Stimulation
Canvas → Artist Brain → Neural Stimulation
Minimize the difference between the two stimulations, then draw on the canvas
91
Content Generation
Content → VGG19 → Filter Responses (Depth x Width*Height)
Canvas → VGG19 → Filter Responses (Depth x Width*Height)
Minimize the difference by updating the color of the pixels → Result

92
Content Generation
Input photo → layer l's filter responses, indexed by depth (i) and width*height (j)
Input canvas → layer l's filter responses, indexed by depth (i) and width*height (j)

93
Content Generation
• Backward Propagation
Input canvas → VGG19 → layer l's filter responses
Backward propagation gives the gradient with respect to the canvas; update the canvas along it, scaled by the learning rate

94
Content Generation

95
Content Generation
VGG19

conv1_2 conv2_2 conv3_4 conv4_4 conv5_1 conv5_2

96
Style Generation
Artwork → VGG19 → Filter Responses (Depth x Width*Height, position-dependent) → Gram Matrix G (Depth x Depth, position-independent)
97
Style Generation
Layer l’s Filter Responses
Width*Height Gram Matrix
1. .5 1. .5 .25 1.

Depth
.5 .5 .25 .5
Depth

.5 .25 .25

1. 1. .5 1.

k1 k2
Depth
k2

k1
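A minimal sketch of the Gram matrix, assuming the standard definition G[k1][k2] = Σ_j F[k1][j] F[k2][j] over flattened positions j. The response values are made up and the function name is illustrative.

```python
def gram_matrix(responses):
    """responses[k][j]: response of filter k at flattened position j."""
    depth = len(responses)
    return [
        [sum(a * b for a, b in zip(responses[k1], responses[k2]))
         for k2 in range(depth)]
        for k1 in range(depth)
    ]


responses = [[1.0, 0.5], [0.5, 0.0]]   # depth 2, two positions
shuffled = [[0.5, 1.0], [0.0, 0.5]]    # same responses, positions swapped
G = gram_matrix(responses)
```

Swapping positions leaves G unchanged, which is the sense in which the Gram matrix is position-independent.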

98
Style Generation
Input artwork → layer l's filter responses → layer l's Gram matrix
Input canvas → layer l's filter responses → layer l's Gram matrix

99
Style Generation
Style → VGG19 → Filter Responses → Gram Matrix G
Canvas → VGG19 → Filter Responses → Gram Matrix G
Minimize the difference between the two Gram matrices by updating the color of the pixels → Result
100
Style Generation

101
Style Generation
VGG19

Layers used for each of the five results:
• Conv1_1
• Conv1_1, Conv2_1
• Conv1_1, Conv2_1, Conv3_1
• Conv1_1, Conv2_1, Conv3_1, Conv4_1
• Conv1_1, Conv2_1, Conv3_1, Conv4_1, Conv5_1
102
Artwork Generation
VGG19 Filter Responses

Gram Matrix

103
Artwork Generation

VGG19 (content): Conv4_2
VGG19 (style): Conv1_1, Conv2_1, Conv3_1, Conv4_1, Conv5_1

104
Artwork Generation

105
Content vs. Style

0.15 0.05

0.02 0.007

106
Neural Doodle
• Paper: https://arxiv.org/abs/1603.01768
• Source code: https://github.com/alexjc/neural-doodle
content semantic maps result

style

107
Neural Doodle
• Image analogy

108
Neural Doodle
• Image analogy

Warning: disturbing link, open with caution!
https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/trump-image-analogy.jpg

109
Real-time Texture Synthesis
• Paper: https://arxiv.org/pdf/1604.04382v1.pdf
◦ GAN: https://arxiv.org/pdf/1406.2661v1.pdf
◦ VAE: https://arxiv.org/pdf/1312.6114v10.pdf

• Source Code : https://github.com/chuanli11/MGANs

110
Outline
• CNN(Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

111
A Convolutional Neural Network for Modelling Sentences
• Paper: https://arxiv.org/abs/1404.2188
• Source code: https://github.com/FredericGodin/DynamicCNN

112
Drawbacks of Recursive Neural Networks (RvNN)
• Needs a human-labeled syntax tree during training

[Figure: the word vectors of "This is a dog" are combined step by step by the same RvNN unit, following the syntax tree]

113
Drawbacks of Recursive Neural Networks (RvNN)
• Ambiguity in natural language

http://3rd.mafengwo.cn/travels/info_weibo.php?id=2861280
http://www.appledaily.com.tw/realtimenews/article/new/20151006/705309/

114
Element-wise 1D operations on word vectors
• 1D Convolution or 1D Pooling

Represented
by
operation operation

This is a
This is a

115
From RvNN to CNN
• RvNN: the same RvNN unit is reused at every level over "This is a dog"
• CNN: different conv layers (conv1, conv2, conv3) are stacked over "This is a dog"

116
CNN with Max-Pooling Layers
• Similar to syntax tree
• But human-labeled syntax tree is not needed

conv2 Max conv2


Pooling
pool1 pool1 pool1 pool1

conv1 conv1 conv1 conv1 conv1

This is a dog This is a dog

117
Sentiment Analysis by CNN
• Use softmax layer to classify the sentiments
"This movie is awesome" → conv1 → pool1 → conv2 → softmax → positive
"This movie is awful" → conv1 → pool1 → conv2 → softmax → negative

118
Sentiment Analysis by CNN
• Build the “correct syntax tree” by training
[Figure: "This movie is awesome" is classified as negative; the error at the softmax output is sent back through conv2, pool1, and conv1 by backward propagation]

119
Sentiment Analysis by CNN
• Build the “correct syntax tree” by training
[Figure: after the weights are updated, the prediction for "This movie is awesome" changes from negative to positive]

120
Multiple Filters
• Richer features than RNN

Layer 2: filter21, filter22, filter23
Layer 1: filter11, filter12, filter13 (applied at each position)
Input: This is a
121
Sentence can’t be easily resized
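• An image can be easily resized
• A sentence can't be easily resized

Original sentence: 全台灣最高樓在台北市 (The tallest building in all of Taiwan is in Taipei City)
Resized versions:
全台灣最高的高樓在台北市 (The tallest of the tall buildings in all of Taiwan is in Taipei City)
全台灣最高樓在台北 (The tallest building in all of Taiwan is in Taipei)
台灣最高樓在台北 (The tallest building in Taiwan is in Taipei)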

122
Various Input Size
• Convolutional layers and pooling layers
◦ can handle input with various size

pool1 pool1 pool1

conv1 conv1 conv1 conv1 conv1

the dog run This is a dog

123
Various Input Size
• Fully-connected layer and softmax layer
◦ need fixed-size input

softmax softmax

fc fc

The dog run This is a dog

124
k-max Pooling
• Choose the k largest values
• Preserve the order of the input values
• Variable-size input, fixed-size output

3-max pooling: 12 5 21 15 7 4 9 → 12 21 15
3-max pooling: 13 4 1 7 8 → 13 7 8
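The two examples can be reproduced with a short sketch. The function name is illustrative; on ties it keeps the earlier value, which is an assumption the slides do not address.

```python
def k_max_pooling(values, k):
    """Keep the k largest values, in their original order."""
    top = sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]
    return [values[i] for i in sorted(top)]


first = k_max_pooling([12, 5, 21, 15, 7, 4, 9], 3)
second = k_max_pooling([13, 4, 1, 7, 8], 3)
```

Inputs of length 7 and 5 both produce an output of length 3, which is what lets a fixed-size layer follow.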

125
Wide Convolution
• Ensures that all weights reach the entire sentence

conv conv conv conv conv conv conv conv

Narrow convolution Wide convolution
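A sketch of the difference, assuming a filter of width m over a sentence of length s: narrow convolution yields s - m + 1 outputs, while wide convolution zero-pads m - 1 positions on each side and yields s + m - 1 outputs. The filter is applied without flipping, and the names and values are illustrative.

```python
def conv1d(seq, weights, wide):
    """1D convolution over a sequence; wide=True zero-pads both ends."""
    m = len(weights)
    if wide:
        seq = [0.0] * (m - 1) + list(seq) + [0.0] * (m - 1)
    return [
        sum(w * x for w, x in zip(weights, seq[i:i + m]))
        for i in range(len(seq) - m + 1)
    ]


sentence = [1.0, 2.0, 3.0, 4.0]
narrow = conv1d(sentence, [1.0, 1.0, 1.0], wide=False)
wide = conv1d(sentence, [1.0, 1.0, 1.0], wide=True)
```

In the wide case every weight is applied to every input value, including those at the ends of the sentence.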

126
Dynamic k-max Pooling

$k_l = \max\left(k_{top}, \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$

l: index of the current layer
k_l: k of the current layer
k_top: k of the top layer
L: total number of layers
s: length of the input sentence

k_top and L are constants.

[Figure: input sentence of length s → wide convolution & k-max pooling (k_l) → wide convolution & k-max pooling (k_top)]
127
Dynamic k-max Pooling

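$k_l = \max\left(k_{top}, \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$

With k_top = 3, L = 2, s = 10:

$k_1 = \max\left(3, \left\lceil \frac{2 - 1}{2} \times 10 \right\rceil\right) = 5$

[Figure: sentence of length 10 → conv & pooling (k_1 = 5) → conv & pooling (k_top = 3)]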

128
Dynamic k-max Pooling

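$k_l = \max\left(k_{top}, \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$

With k_top = 3, L = 2, s = 14:

$k_1 = \max\left(3, \left\lceil \frac{2 - 1}{2} \times 14 \right\rceil\right) = 7$

[Figure: sentence of length 14 → conv & pooling (k_1 = 7) → conv & pooling (k_top = 3)]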

129
Dynamic k-max Pooling

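$k_l = \max\left(k_{top}, \left\lceil \frac{L - l}{L}\, s \right\rceil\right)$

With k_top = 3, L = 2, s = 8:

$k_1 = \max\left(3, \left\lceil \frac{2 - 1}{2} \times 8 \right\rceil\right) = 4$

[Figure: sentence of length 8 → conv & pooling (k_1 = 4) → conv & pooling (k_top = 3)]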
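The three worked examples can be checked with a short sketch of the formula. It uses integer arithmetic for the ceiling to avoid floating-point rounding; the function and parameter names are illustrative.

```python
def dynamic_k(layer, total_layers, sentence_length, k_top):
    """k_l = max(k_top, ceil((L - l) / L * s))."""
    numerator = (total_layers - layer) * sentence_length
    ceil_value = -(-numerator // total_layers)  # integer ceiling division
    return max(k_top, ceil_value)


k_for_10 = dynamic_k(layer=1, total_layers=2, sentence_length=10, k_top=3)
k_for_14 = dynamic_k(layer=1, total_layers=2, sentence_length=14, k_top=3)
k_for_8 = dynamic_k(layer=1, total_layers=2, sentence_length=8, k_top=3)
```

At the top layer (l = L) the ceiling term is 0, so the formula returns k_top regardless of the sentence length.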

130
Dynamic k-max Pooling

Wide convolution &


Dynamic k-max pooling

131
Convolutional Neural Networks for Sentence Classification
• Paper: http://www.aclweb.org/anthology/D14-1181
• Source code: https://github.com/yoonkim/CNN_sentence

132
Static & Non-Static Channel
• Pretrained by word2vec
• Static: fix the values during training
• Non-Static: update the values during training

133
About the Lecturer

Mark Chang
HTC Research & Healthcare
Deep Learning Algorithms
Research Engineer

• Email: ckmarkoh at gmail dot com


• Blog: http://cpmarkchang.logdown.com
• Github: https://github.com/ckmarkoh
• Slideshare: http://www.slideshare.net/ckmarkohchang
• Youtube: https://www.youtube.com/channel/UCckNPGDL21aznRhl3EijRQw

134
