Keras Tutorial
https://www.slideshare.net/0xdata/deep-learning-with-mxnet-dmitry-larko
Overview of the tutorial
• What is Keras?
• Basics of Keras environment
• Building Convolutional neural networks
• Building Recurrent neural networks
• Introduction to other types of layers
• Introduction to Loss functions and Optimizers in Keras
• Using Pre-trained models in Keras
• Saving and loading weights and models
• Popular architectures in Deep Learning
What is Keras?
• Deep neural network library in Python
• High-level neural networks API
• Modular – building a model is just stacking layers and connecting computational graphs
• Runs on top of either TensorFlow, Theano, or CNTK
• Why use Keras?
• Useful for fast prototyping – no need to implement backpropagation or write optimization procedures yourself
• Supports convolutional layers, recurrent layers, and combinations of both
• Runs seamlessly on CPU and GPU
• Almost any architecture can be designed using this framework
• Open-source code – large community support
Working principle - Backend
• Computational graphs
• Express complex expressions as a combination of simple operations
• Useful for calculating derivatives during backpropagation
• Easier to implement distributed computation
• Just specify the inputs and outputs, and make sure the graph is connected
• Example: e = c*d, where c = a+b and d = b+1, so e = (a+b)*(b+1); here "a" and "b" are the inputs (see the sketch below)
http://colah.github.io/posts/2015-08-Backprop/
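A minimal sketch of this example graph, written with the Keras backend API (assuming Keras 2 with a TensorFlow 1.x-style backend; K.placeholder and K.function are the backend's graph-building utilities):

import numpy as np
from keras import backend as K

# Inputs of the graph
a = K.placeholder(shape=(1,))
b = K.placeholder(shape=(1,))

# Simple operations composed into a graph
c = a + b        # c = a + b
d = b + 1.0      # d = b + 1
e = c * d        # e = (a + b) * (b + 1)

# Compile the graph into a callable function and evaluate it
f = K.function(inputs=[a, b], outputs=[e])
print(f([np.array([1.0]), np.array([2.0])]))  # (1 + 2) * (2 + 1) = 9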
General pipeline for implementing an ANN
• Design and define the neural network architecture
• Pipeline: Prepare input (images, videos, text, audio) → Define the ANN model (Sequential or Functional style; MLP, CNN, RNN) → Choose an optimizer (SGD, RMSprop, Adam) → Choose a loss function (MSE, cross entropy, hinge) → Train and evaluate the model
Procedure to implement an ANN in Keras
• Import the Sequential class from keras.models (a minimal end-to-end sketch is given below)
[1] https://blog.heuritech.com/2016/02/29/a-brief-report-of-the-heuritech-deep-learning-meetup-5/vgg16/
[2] https://www.cc.gatech.edu/~hays/7476/projects/Avery_Wenchen/
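A minimal end-to-end sketch of the pipeline above, using the Sequential class (the toy data, layer sizes, and hyperparameters are illustrative assumptions, not from the slides):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

# Prepare input: 1000 hypothetical samples with 20 features each, 10 classes
x_train = np.random.random((1000, 20))
y_train = to_categorical(np.random.randint(10, size=(1000,)), num_classes=10)

# Define the ANN model (Sequential style MLP)
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(20,)))
model.add(Dense(10, activation='softmax'))

# Choose optimizer and loss function, then train and evaluate
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)
loss, acc = model.evaluate(x_train, y_train)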
Keras models – Functional
• Functional Model
• Multi-input and multi-output models
• Complex models which fork into two or more branches
• Models with shared (weights) layers (a multi-input/multi-output sketch follows below)
[1] https://www.sciencedirect.com/science/article/pii/S0263224117304517
[2] Unsupervised Domain Adaptation by Backpropagation, https://arxiv.org/abs/1409.7495
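A sketch of a multi-input, multi-output model in the Functional style (all layer sizes and names here are illustrative assumptions):

from keras.models import Model
from keras.layers import Input, Dense, concatenate

# Two separate inputs, each passed through its own branch
in_a = Input(shape=(16,))
in_b = Input(shape=(8,))
merged = concatenate([Dense(32, activation='relu')(in_a),
                      Dense(32, activation='relu')(in_b)])

# Two outputs forking from the shared representation
main_out = Dense(1, activation='sigmoid', name='main')(merged)
aux_out = Dense(10, activation='softmax', name='aux')(merged)

model = Model(inputs=[in_a, in_b], outputs=[main_out, aux_out])
model.compile(optimizer='adam',
              loss={'main': 'binary_crossentropy', 'aux': 'categorical_crossentropy'})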
Keras models – Functional (Domain Adaptation)
• Train on Domain A and test on Domain B
• Results in poor performance on the test set, because the data come from different domains
• Solution: adapt the model to both domains
• Domain A: with labels; Domain B: without labels
Convolutional neural network - Sequential model
• Mini VGG-style network
• Input dimension – 4D array: [N_train, height, width, channels]
• N_train – number of training samples
• height, width – height and width of the image
• channels – number of channels (3 for an RGB image, 1 for a grayscale image)
• FC – fully connected layers (dense layers)
• Architecture: Input (4D array) → Conv-32 → Conv-32 → Maxpool → Conv-64 → Conv-64 → Maxpool → FC-256 → FC-10 (sketch below)
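A sketch of this mini VGG-style network with the Sequential model (3×3 kernels and a 32×32×3 input, e.g. CIFAR-10-sized images, are assumptions; the slides only specify the filter counts):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32, 32, 3)))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())                          # flatten feature maps before the FC layers
model.add(Dense(256, activation='relu'))      # FC-256
model.add(Dense(10, activation='softmax'))    # FC-10 (class scores)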
Simple MLP network - Functional model
• Import the class called "Model"
• Each layer explicitly returns a tensor
• Pass the returned tensor to the next layer as input
• Explicitly specify the model's inputs and outputs (sketch below)
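A sketch of a simple MLP in the Functional style (the 784-dimensional input, e.g. flattened MNIST images, and the layer sizes are illustrative assumptions):

from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(784,))                    # input tensor
x = Dense(64, activation='relu')(inputs)        # each layer returns a tensor...
x = Dense(64, activation='relu')(x)             # ...which is passed to the next layer
outputs = Dense(10, activation='softmax')(x)

model = Model(inputs=inputs, outputs=outputs)   # explicitly declare inputs and outputs
model.compile(optimizer='sgd', loss='categorical_crossentropy')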
Recurrent Neural Networks
• RNNs are used on sequential data – text, audio, genomes, etc.
• Recurrent networks are of three types:
• Vanilla RNN
• LSTM
• GRU
• They are feedforward networks with internal feedback
• The output at time "t" depends on the current input and previous values
https://towardsdatascience.com/sentiment-analysis-using-rnns-lstm-60871fa6aeba
Recurrent Neural Network
(Figure: a recurrent network ending in a Dense output layer)
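A sketch of such a recurrent network in Keras, here for binary text classification (the vocabulary size, layer sizes, and task are illustrative assumptions; the slide's figure only indicates a recurrent stack ending in a Dense layer):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(input_dim=10000, output_dim=128))  # map word indices to vectors
model.add(LSTM(64))                                    # could also be SimpleRNN or GRU
model.add(Dense(1, activation='sigmoid'))              # e.g. sentiment probability
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])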
Convolution layers
• 1D Conv
keras.layers.convolutional.Conv1D(filters, kernel_size, strides=1, padding='valid', dilation_rate=1,
activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None,
bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, bias_constraint=None)
Applications: Audio signal processing, Natural language processing
• 2D Conv
keras.layers.convolutional.Conv2D(filters, kernel_size, strides=(1, 1), padding='valid', data_format=None,
dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros',
kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
bias_constraint=None)
Applications: Computer vision ‐ Images
• 3D Conv
keras.layers.convolutional.Conv3D(filters, kernel_size, strides=(1, 1, 1), padding='valid', data_format=None,
dilation_rate=(1, 1, 1), activation=None, use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros',
kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
bias_constraint=None)
Applications: Computer vision – Videos (Convolution along temporal dimension)
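Minimal usage sketches for the three cases (filter counts and kernel sizes are illustrative):

from keras.layers import Conv1D, Conv2D, Conv3D

conv_audio = Conv1D(64, kernel_size=5, activation='relu')          # 1D: audio, text
conv_image = Conv2D(64, kernel_size=(3, 3), activation='relu')     # 2D: images
conv_video = Conv3D(64, kernel_size=(3, 3, 3), activation='relu')  # 3D: videos (time, H, W)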
Pooling layers
• Max pool
keras.layers.pooling.MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid')
• Average pool
keras.layers.pooling.AveragePooling2D(pool_size=(2, 2), strides=None, padding='valid')
• Up sampling
keras.layers.convolutional.UpSampling2D(size=(2, 2))
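Usage sketch (parameter values illustrative): with the default 2×2 settings, pooling halves the spatial dimensions and up-sampling doubles them:

from keras.layers import MaxPooling2D, AveragePooling2D, UpSampling2D

pool = MaxPooling2D(pool_size=(2, 2))     # halves height and width
avg = AveragePooling2D(pool_size=(2, 2))
up = UpSampling2D(size=(2, 2))            # doubles height and width (e.g. in autoencoders)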
General layers
• Dense
keras.layers.core.Dense(units, activation=None, use_bias=True,
kernel_initializer='glorot_uniform', bias_initializer='zeros', kernel_regularizer=None,
bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
bias_constraint=None)
• Dropout
keras.layers.core.Dropout(rate, noise_shape=None, seed=None)
• Embedding
keras.layers.embeddings.Embedding(input_dim, output_dim, input_length=None,
embeddings_initializer='uniform', embeddings_regularizer=None, activity_regularizer=None,
embeddings_constraint=None, mask_zero=False)
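Usage sketch (all values illustrative):

from keras.layers import Dense, Dropout, Embedding

fc = Dense(256, activation='relu')
drop = Dropout(0.5)        # randomly drops 50% of the activations during training
embed = Embedding(input_dim=10000, output_dim=128, input_length=100)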
Optimizers available in Keras
• How do we find the "best set of parameters" (weights and biases) for the given network?
• Optimization
• Optimizers vary in their speed of convergence and their ability to avoid getting stuck in local minima
• SGD – Stochastic gradient descent
• SGD with momentum
• Adam
• AdaGrad
• RMSprop
• AdaDelta
• Detailed explanation of each optimizer is given in the “Deep learning book”
• URL: http://www.deeplearningbook.org/contents/optimization.html
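Sketch of how an optimizer is configured and passed to a model (the tiny model and all hyperparameter values are illustrative assumptions):

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

model = Sequential([Dense(10, activation='softmax', input_shape=(20,))])
sgd = SGD(lr=0.01, momentum=0.9)   # SGD with momentum
model.compile(optimizer=sgd, loss='categorical_crossentropy')
# Equivalently, pass the optimizer by name to use its default settings:
# model.compile(optimizer='adam', loss='categorical_crossentropy')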
Loss functions available in Keras
• MSE – mean squared error
• Categorical cross entropy – for "K" classes
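For reference, the standard definitions of these two losses, in my notation (N samples with targets y and predictions ŷ; K classes with a one-hot target y and predicted distribution ŷ):

\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2
\qquad
\mathrm{CCE} = -\sum_{k=1}^{K} y_k \log\left(\hat{y}_k\right)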
Image recognition networks
• AlexNet – 2012
• VGG – 2014
[1] AlexNet, https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[2] VGG Net, https://arxiv.org/pdf/1409.1556.pdf
Image recognition networks
• ResNet – 2015 (residual connections)
• DenseNet – 2016 (dense connections)
[1] ResNet, https://arxiv.org/abs/1512.03385
[2] DenseNet, https://arxiv.org/abs/1608.06993
Performance of the recognition networks
Autoencoders
• Unsupervised representation learning
• Dimensionality reduction
• Denoising
https://www.researchgate.net/figure/Figure-9-A-autoencoder-with-many-hidden-layers-two-stacked-autoencoders_282997080_fig9
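A minimal fully connected autoencoder sketch (the 784-dimensional input and 32-dimensional code are illustrative assumptions):

from keras.models import Model
from keras.layers import Input, Dense

inputs = Input(shape=(784,))
encoded = Dense(32, activation='relu')(inputs)       # bottleneck: dimensionality reduction
decoded = Dense(784, activation='sigmoid')(encoded)  # reconstruction of the input

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# Unsupervised: the input itself is the training target
# autoencoder.fit(x_train, x_train, epochs=10)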
Generative Adversarial Network
https://indico.lal.in2p3.fr/event/3487/?view=standard_inline_minutes
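A minimal sketch of the GAN setup in Keras (all sizes and hyperparameters are illustrative assumptions; real GANs use deeper, usually convolutional, networks):

from keras.models import Sequential, Model
from keras.layers import Dense, Input
from keras.optimizers import Adam

# Generator: noise vector -> flattened fake image
generator = Sequential()
generator.add(Dense(128, activation='relu', input_shape=(100,)))
generator.add(Dense(784, activation='tanh'))

# Discriminator: flattened image -> probability that it is real
discriminator = Sequential()
discriminator.add(Dense(128, activation='relu', input_shape=(784,)))
discriminator.add(Dense(1, activation='sigmoid'))
discriminator.compile(optimizer=Adam(lr=0.0002), loss='binary_crossentropy')

# Combined model: train the generator to fool the (frozen) discriminator
discriminator.trainable = False
z = Input(shape=(100,))
gan = Model(z, discriminator(generator(z)))
gan.compile(optimizer=Adam(lr=0.0002), loss='binary_crossentropy')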
Interesting Applications using GANs
• Generate images from a textual description
• Perform arithmetic in latent space
[1] Stack GAN, https://arxiv.org/abs/1612.03242
[2] DC GAN, https://arxiv.org/abs/1511.06434
Interesting Applications using GANs
• Generate images of the same scene under different weather conditions
• Transfer the style of a painting from one image to another
• Change the content of the image
[1] UNIT, https://arxiv.org/pdf/1703.00848
[2] CycleGAN, https://arxiv.org/abs/1703.10593
Community contributed layers and other functionalities
https://github.com/farizrahman4u/keras-contrib/tree/master/keras_contrib
https://github.com/fchollet/keras/tree/master/keras/layers
Keras Documentation – keras.io
Keras Blog - https://blog.keras.io/index.html
Questions?