Presentation 1
• The softmax function is also a type of sigmoid function, but it is useful when we are trying to handle multi-class classification problems.
• The sigmoid function can handle only two classes. The softmax function squeezes the output for each class between 0 and 1 and also divides each output by the sum of all the outputs.
• This essentially gives the probability of the input being in a particular class. It can be defined as softmax(z_i) = e^(z_i) / sum_j e^(z_j); a small Python sketch follows this list.
• The expected output values of the model, o1 and o2, are 0.01 and 0.99.
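• A minimal sketch of softmax in Python (NumPy; the function name and the example scores are just for illustration):

import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, exponentiate, then normalize
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

# Three raw class scores -> probabilities that sum to 1
print(softmax(np.array([2.0, 1.0, 0.1])))  # approx [0.659, 0.242, 0.099]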
The Forward Pass
• In this stage the neural network tries to predict the expected output values from the initial input values i1 and i2 and the weights and biases given above.
• For this we take each neuron's total net input (the inputs multiplied by their respective weights, plus the bias) and apply the activation function to it.
• This is how we calculate the total net input for the first hidden-layer neuron, h1:
net_h1 = w1 * i1 + w2 * i2 + b1 * 1
net_h1 = 0.15 * 0.05 + 0.2 * 0.1 + 0.35 * 1 = 0.3775
• We then squash net_h1 with the logistic (sigmoid) activation, out_h1 = 1 / (1 + e^(-0.3775)) = 0.593269992, repeat the same calculation for h2, and carry the process through the output layer to get the network's outputs:
out_o1 = 0.75136507
out_o2 = 0.772928465
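• A sketch of the forward pass in Python. The inputs, w1, w2 and b1 are the values shown above; the remaining initial weights and biases (w3, w4, w5-w8, b2) are assumed, since the setup slide is not reproduced here:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

i1, i2 = 0.05, 0.10
w1, w2, b1 = 0.15, 0.20, 0.35             # given above
w3, w4, b2 = 0.25, 0.30, 0.60             # assumed initial values
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55   # assumed initial values

# Hidden layer
net_h1 = w1 * i1 + w2 * i2 + b1           # 0.3775
out_h1 = sigmoid(net_h1)                  # 0.593269992
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)

# Output layer
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)  # ~0.7514
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)  # ~0.7729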
Calculating the total error
• We calculate the error for each output neuron using the squared error function, E = 1/2 * (target - output)^2, and sum them to get the total error.
• The target output for o1 is 0.01 but the neural network outputs 0.75136507, therefore its error is:
E_o1 = 1/2 * (target_o1 - out_o1)^2 = 1/2 * (0.01 - 0.75136507)^2 = 0.274811083
• Repeating this process for the second output neuron (target 0.99) gives:
E_o2 = 0.023560026
• So, the total error of the neural network is the sum of both errors:
E_total = E_o1 + E_o2 = 0.274811083 + 0.023560026 = 0.298371109
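• The same calculation in Python, continuing from the forward-pass sketch above:

target_o1, target_o2 = 0.01, 0.99

E_o1 = 0.5 * (target_o1 - out_o1) ** 2   # 0.274811083
E_o2 = 0.5 * (target_o2 - out_o2) ** 2   # 0.023560026
E_total = E_o1 + E_o2                    # 0.298371109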
Output Layer
• Consider w5. We want to know how much a change in w5 affects the total error, i.e. the partial derivative ∂E_total/∂w5. By the chain rule,
∂E_total/∂w5 = ∂E_total/∂out_o1 * ∂out_o1/∂net_o1 * ∂net_o1/∂w5
• When we take the partial derivative of the total error with respect to out_o1, the quantity 1/2 * (target_o2 - out_o2)^2 becomes zero because out_o1 does not affect it, which means we're taking the derivative of a constant, which is zero. So,
∂E_total/∂out_o1 = -(target_o1 - out_o1) = -(0.01 - 0.75136507) = 0.74136507
• Next, how much does the output of o1 change with respect to its total net input? The partial derivative of the logistic function is the output multiplied by 1 minus the output:
∂out_o1/∂net_o1 = out_o1 * (1 - out_o1) = 0.75136507 * (1 - 0.75136507) = 0.186815602
• Finally, how much does the total net input of o1 change with respect to w5? Since net_o1 is w5 * out_h1 plus terms that do not involve w5,
∂net_o1/∂w5 = out_h1 = 0.593269992
• Now we put all of these values together:
∂E_total/∂w5 = 0.74136507 * 0.186815602 * 0.593269992 = 0.082167041
• To decrease the error, we then subtract this value from the current weight, optionally multiplied by some learning rate, eta, which we'll set to 0.5:
w5_new = w5 - eta * ∂E_total/∂w5
• We repeat this process to get the new weights for w6, w7, and w8 (sketched below).
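• The whole output-layer update for w5 in Python, continuing from the sketches above; recall that the starting value w5 = 0.40 is an assumption:

# Chain rule: dE_total/dw5 = dE/dout_o1 * dout_o1/dnet_o1 * dnet_o1/dw5
dE_dout_o1   = -(target_o1 - out_o1)              # 0.74136507
dout_dnet_o1 = out_o1 * (1 - out_o1)              # 0.186815602
dnet_dw5     = out_h1                             # 0.593269992
dE_dw5 = dE_dout_o1 * dout_dnet_o1 * dnet_dw5     # ~0.082167041

eta = 0.5                                         # learning rate
w5_new = w5 - eta * dE_dw5                        # ~0.3589 with the assumed w5 = 0.40

# w6 uses out_h2 instead of out_h1; w7 and w8 use the o2 error terms instead.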
Hidden Layer
• We continue the backwards pass by calculating new values for w1, w2, w3, and w4.
• So, for that we need to calculate
∂E_total/∂w1 = ∂E_total/∂out_h1 * ∂out_h1/∂net_h1 * ∂net_h1/∂w1
• Starting with ∂E_total/∂out_h1: because out_h1 affects both out_o1 and out_o2, it needs to take both output neurons into account:
∂E_total/∂out_h1 = ∂E_o1/∂out_h1 + ∂E_o2/∂out_h1
• Each of these terms is found with the same chain rule as before, e.g. ∂E_o1/∂out_h1 = ∂E_o1/∂net_o1 * ∂net_o1/∂out_h1, where ∂net_o1/∂out_h1 = w5.
• Now that we have ∂E_total/∂out_h1, we need to figure out ∂out_h1/∂net_h1 and then ∂net_h1/∂w for each weight:
∂out_h1/∂net_h1 = out_h1 * (1 - out_h1) = 0.593269992 * (1 - 0.593269992) = 0.241300709
• We calculate the partial derivative of the total net input to h1 with respect to w1 the same way as we did for the output neuron:
∂net_h1/∂w1 = i1 = 0.05
• Putting these three factors together gives ∂E_total/∂w1, and we update w1 (and likewise w2, w3, and w4) with the same learning rate as before.
• When we fed forward the 0.05 and 0.1 inputs originally, the error on the network was 0.298371109.
• After this first round of back propagation, the total error is now down to 0.291027924.
• It might not seem like much, but after repeating this process 10,000 times, for example, the error plummets to
0.0000351085.
• At that point, when we feed forward 0.05 and 0.1, the two output neurons generate 0.015912196 (vs 0.01 target) and 0.984065734 (vs 0.99 target).
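• A compact end-to-end sketch of the whole procedure in Python. The initial weights w3-w8 and b2 are again assumed, the biases are left fixed as in the walkthrough above, and the exact error after 10,000 iterations may differ in the last digits depending on exactly when the weights are refreshed:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

i1, i2 = 0.05, 0.10
t1, t2 = 0.01, 0.99                          # target outputs
w1, w2, w3, w4 = 0.15, 0.20, 0.25, 0.30      # w3, w4 assumed
w5, w6, w7, w8 = 0.40, 0.45, 0.50, 0.55      # assumed
b1, b2 = 0.35, 0.60                          # b2 assumed
eta = 0.5

for _ in range(10000):
    # Forward pass
    out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
    out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
    out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
    out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)

    # Output-layer deltas: dE/dnet_o = -(target - out) * out * (1 - out)
    d_o1 = -(t1 - out_o1) * out_o1 * (1 - out_o1)
    d_o2 = -(t2 - out_o2) * out_o2 * (1 - out_o2)

    # Hidden-layer deltas, using the output weights from before this update
    d_h1 = (d_o1 * w5 + d_o2 * w7) * out_h1 * (1 - out_h1)
    d_h2 = (d_o1 * w6 + d_o2 * w8) * out_h2 * (1 - out_h2)

    # Weight updates (gradient = delta * upstream activation)
    w5 -= eta * d_o1 * out_h1
    w6 -= eta * d_o1 * out_h2
    w7 -= eta * d_o2 * out_h1
    w8 -= eta * d_o2 * out_h2
    w1 -= eta * d_h1 * i1
    w2 -= eta * d_h1 * i2
    w3 -= eta * d_h2 * i1
    w4 -= eta * d_h2 * i2

# Final forward pass and total error
out_h1 = sigmoid(w1 * i1 + w2 * i2 + b1)
out_h2 = sigmoid(w3 * i1 + w4 * i2 + b1)
out_o1 = sigmoid(w5 * out_h1 + w6 * out_h2 + b2)
out_o2 = sigmoid(w7 * out_h1 + w8 * out_h2 + b2)
E_total = 0.5 * (t1 - out_o1) ** 2 + 0.5 * (t2 - out_o2) ** 2
print(E_total, out_o1, out_o2)   # error on the order of 3.5e-5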