
On Convolutional Neural Network
Zeng Huang
Brain-like Computing and Machine Intelligence Lab
Shanghai Jiao Tong University (China)
Interests: Vision, Graphics, Machine Learning
http://zeng.photography/
Outline
• Basic concepts of image classification
• Ideas behind the Convolutional Neural Network (CNN)
• Technical details & tricks applied to CNNs
• Observations from experiments so far

The Problem
• Each image belongs to one of several categories, such as ‘the
image contains a cat’ or ‘the image represents a traffic jam’.
• For the training data, each image is assigned a label indicating its category.
• For the test data, the goal is to classify each image into one of these categories.
• More precisely, the goal is to determine the probability that each image belongs to each of the categories.

The MNIST Datasets
• Introduced by Yann LeCun
• Handwritten digits, 0 to 9
• Each image is labeled with the digit it represents
• 28×28 pixels, with the digit centered in the image
• 60,000 training images, 10,000 test images
[Figure: 16 test images shown in a 4-by-4 grid]
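As a quick sanity check of these numbers, here is a minimal sketch that fetches MNIST through the Keras dataset helper (an illustrative choice; the slides themselves do not name a loader):

```python
# Minimal sketch: loading MNIST (any MNIST loader would do).
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, x_test.shape)  # (60000, 28, 28) (10000, 28, 28)
print(y_train[:5])                  # each label is the digit the image represents
```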

What is convolution?
• Definition: $(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau$
• Can be seen as an average of $f$ around the moment $t$, weighted by $g$.
• For images, we use the discrete 2D analogue: $(I * K)(x, y) = \sum_{i} \sum_{j} I(x - i,\, y - j)\, K(i, j)$.
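A minimal NumPy sketch of the discrete 2D case (the function name is illustrative):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution: slide the flipped kernel over the image
    and take a weighted sum at every position."""
    k = kernel.shape[0]
    flipped = kernel[::-1, ::-1]          # true convolution flips the kernel
    h = image.shape[0] - k + 1
    w = image.shape[1] - k + 1
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(image[y:y + k, x:x + k] * flipped)
    return out
```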

Example: Gaussian Filter
• In image processing, we often call the kernel function $g$ a 'filter' or 'kernel'.
• A Gaussian filter is a matrix whose values are distributed according to a Gaussian.
• The closer to the center, the greater the value.
• Usually we normalize the values so they sum to one.
[Figure: a 9-by-9 Gaussian kernel; the center has the greatest value]
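A small sketch of building such a kernel (the size and sigma below are illustrative choices):

```python
import numpy as np

def gaussian_kernel(size=9, sigma=2.0):
    """A size-by-size kernel sampled from a 2D Gaussian,
    normalized so its values sum to one."""
    ax = np.arange(size) - (size - 1) / 2.0   # coordinates centered at 0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()              # normalize the sum to one
```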

Example: Gaussian Filter

[Figure: the Gaussian filter acts as a smoothing filter]


Example: Sobel Filter

$$G_x = \begin{pmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{pmatrix} \qquad G_y = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}$$

The Sobel filter acts as an edge detector.
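A sketch of applying both Sobel kernels and combining them into a gradient magnitude, assuming SciPy is available:

```python
import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[1, 0, -1],
                    [2, 0, -2],
                    [1, 0, -1]], dtype=float)
sobel_y = np.array([[ 1,  2,  1],
                    [ 0,  0,  0],
                    [-1, -2, -1]], dtype=float)

def sobel_edges(image):
    """Gradient magnitude: strong responses where the image has edges."""
    gx = convolve(image, sobel_x)   # horizontal gradient
    gy = convolve(image, sobel_y)   # vertical gradient
    return np.hypot(gx, gy)
```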
Convolution in Neural Network
• A neural cell is connected only to the pixels in a certain region (its receptive field).
• All cells on the same layer share the same weights.
• The weights can hence be viewed as the values of a convolutional kernel.
• We hope the CNN learns kernels that actually extract useful features from the image (a sketch of such a layer follows below).
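A minimal sketch of such a layer's forward pass: one shared kernel, one local patch per cell (the names and the tanh nonlinearity are illustrative assumptions):

```python
import numpy as np

def conv_layer_forward(image, kernel, bias):
    """Each output cell looks at one k-by-k patch of the input;
    every cell reuses the SAME kernel (weight sharing)."""
    k = kernel.shape[0]
    h, w = image.shape[0] - k + 1, image.shape[1] - k + 1
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            patch = image[y:y + k, x:x + k]       # local receptive field
            out[y, x] = np.sum(patch * kernel) + bias
    return np.tanh(out)                            # example nonlinearity
```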
Comparison with NN
Suppose we have $n \times n$ pixels in the first layer and $n \times n$ cells in the second layer, and the CNN uses a $k \times k$ kernel between these two layers.
• There are $n^2 \times n^2 = n^4$ weights in the fully-connected NN, but only $k^2$ weights in the CNN.
• But is $k^2$ enough?
• All the weights in the CNN together detect a single global feature of its input.
[Figure, top: NN (fully-connected) connections; bottom: CNN (local) connections]
Comparison with NN
4(a): Write code to train a convolutional network. Use a 10×10 input; the first level has 36 5×5 convolution gates. Try tying the weights in all the convolution gates together, so you are only training 25 weights in this level. Also try letting the gates learn different weights.
• In this scenario, all CNN gates share 25 weights, while each NN gate has 100 weights.
• But is 25 weights for all gates, versus 100 for each gate, enough?
• No: in the CNN, gates with 25 shared weights actually learn the same thing, only applied to different regions (see the counting sketch below).

Feature Maps for CNN
• Corresponding to one fully-connected cell in the NN, a whole map of cells in the CNN detects a global feature of the input.
• If we have $m$ cells in the NN layer, we should have $m$ maps of cells in the CNN.
• But $m$ often does not have to be large, hence the CNN has much fewer weights to train.
• When we have many maps, we can convolve across several maps to combine their features (a sketch follows below).
[Figure: to achieve some 'correspondence', there are 4 cells in layer 2 of the NN, but 4 maps of cells in the CNN; the weights for the CNN are still fewer than those for the NN.]
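A sketch of producing $m$ feature maps from $m$ kernels (all shapes are illustrative):

```python
import numpy as np

def multi_map_conv(image, kernels):
    """One feature map per kernel: m kernels give m maps,
    each detecting its own feature everywhere in the image."""
    k = kernels.shape[-1]
    h, w = image.shape[0] - k + 1, image.shape[1] - k + 1
    maps = np.zeros((len(kernels), h, w))
    for m, kernel in enumerate(kernels):
        for y in range(h):
            for x in range(w):
                maps[m, y, x] = np.sum(image[y:y + k, x:x + k] * kernel)
    return maps

# e.g. 4 random 5x5 kernels over a 10x10 input -> 4 feature maps
maps = multi_map_conv(np.random.rand(10, 10), np.random.rand(4, 5, 5))
print(maps.shape)  # (4, 6, 6)
```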
Height of CNN Network
• We have seen that a CNN has fewer weights to train in each layer.
• But wait: each convolution makes the output of the upper layer a few pixels narrower, so the height of the network grows linearly with the image scale.
• Solution: reduce the scale of the intermediate results (see the counting sketch below).
[Caption: if we do nothing other than convolution, the height of the network grows linearly with the input scale, causing too many connections and weights to train.]
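A back-of-the-envelope sketch of this growth (the helper is hypothetical):

```python
def conv_depth(n, k, pool=False):
    """How many conv layers until the map is narrower than the kernel?
    Each valid k-by-k convolution shrinks the width by k-1; optional
    2x2 subsampling after each convolution halves it as well."""
    depth = 0
    while n >= k:
        n = n - k + 1            # valid convolution output width
        if pool:
            n //= 2              # subsample by 2
        depth += 1
    return depth

print(conv_depth(28, 5))             # 6: linear in n without pooling
print(conv_depth(28, 5, pool=True))  # 2: grows only ~logarithmically
```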

Subsampling and Pooling
• At certain stages of the CNN, we can reduce the scale of the intermediate layers by subsampling or pooling.
• Recall that adjacent cells represent features of adjacent regions, so it is reasonable to merge them into a single cell.
• This makes the height of the network logarithmic in the image scale (a sketch follows below).
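A minimal max-pooling sketch in NumPy (2×2 pooling as an example):

```python
import numpy as np

def max_pool(feature_map, size=2):
    """Replace each size-by-size block of adjacent cells with its
    maximum: adjacent cells describe adjacent regions, so little is
    lost while the map shrinks by a factor of `size` per axis."""
    h, w = feature_map.shape
    h, w = h // size * size, w // size * size          # drop ragged edges
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))
```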
Overview of a CNN structure

A practical CNN structure is an interleaving of convolution steps and subsampling steps (a shape-only sketch follows below).

[Figure: a visualization of the interleaved structure]
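A shape-only sketch of such an interleaved pipeline, assuming 5×5 kernels and 2×2 subsampling (illustrative numbers that happen to match a LeNet-style stack):

```python
n = 28                       # input width, e.g. an MNIST digit
for stage, k in enumerate([5, 5], start=1):
    n = n - k + 1            # convolution step (valid, k-by-k)
    n = n // 2               # subsampling step (2x2)
    print(f"after stage {stage}: {n}x{n}")
# after stage 1: 12x12
# after stage 2: 4x4   -> small enough for a fully-connected classifier
```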

Compare performance between convolution and fully-connected
• This can be done simply by using convolutional kernels as big as the original image.
• But this makes the height of the network three, since only one hidden layer is convolved.
• The problem lies in: if we want to compare, what property should we keep unchanged?
• The total number of gates? The total number of weights? The network height?
• Some of these cannot be kept equal simultaneously (see the bookkeeping sketch below).
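A hypothetical bit of bookkeeping to make the tension concrete (all numbers are illustrative):

```python
n = 10                                # 10x10 input image

def conv_setup(k, maps):
    """Gates and weights for one conv layer with `maps` tied kernels."""
    gates = maps * (n - k + 1) ** 2   # one gate per output position per map
    weights = maps * k * k            # tied weights: one kernel per map
    return gates, weights

def fc_setup(cells):
    """Gates and weights for one fully-connected hidden layer."""
    return cells, cells * n * n

print(conv_setup(k=10, maps=36))  # (36, 3600): image-sized kernels...
print(fc_setup(cells=36))         # (36, 3600): ...match an FC layer exactly
print(conv_setup(k=5, maps=1))    # (36, 25): smaller kernels break the match
```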

Is it rather tricky or techy?
• When should a CNN perform subsampling?
• What size of kernel should it use at each stage?
• The choice of parameters seems tricky in this area, but maybe we can collect the results and find some interesting patterns in them.
• Another strategy is to start from small problems. But I doubt whether a model for rectangles is suitable for deep learning.
• Also, it is of high value to somehow see what the network has truly learned. This can be achieved by visualization or statistics.

Tools & References I Got for Experiments
• MNIST
• Convolutional Neural Networks (LeNet) (Python)
• DeepLearnToolbox (Matlab)
• Visualization is the best thing for CV.
• 3D convolutional network visualization (CSS & JS & JSON)
• A visualization for fully-connected networks. Might help Qizhe.

Thank you.
• If you have any questions, please feel free to ask.
• My e-mail is frequencyhzs@gmail.com; I'll send you all the materials after class.

