
Q1 – McCULLOCH-PITTS NETWORK

ANS - McCulloch-Pitts Neurons

The McCulloch-Pitts neural model, the earliest ANN model, has only two types
of inputs: excitatory and inhibitory. Excitatory inputs have weights of positive
magnitude, while inhibitory inputs have weights of negative magnitude. The inputs of
a McCulloch-Pitts neuron can be either 0 or 1, and its activation function is a
threshold function. So, the output signal yout is 1 if the input sum ysum is greater
than or equal to a given threshold value, and 0 otherwise. The diagrammatic
representation of the model is as follows:

Simple McCulloch-Pitts neurons can be used to implement logical operations. For that
purpose, the connection weights need to be chosen correctly, along with the threshold
value of the activation function. For better understanding, let me consider an example:
John carries an umbrella if it is sunny or if it is raining. There are four given situations. I
need to decide when John will carry the umbrella. The situations are as follows:
 First scenario: It is not raining, nor is it sunny
 Second scenario: It is not raining, but it is sunny
 Third scenario: It is raining, and it is not sunny
 Fourth scenario: It is raining as well as it is sunny
To analyse the situations using the McCulloch-Pitts neural model, I can consider the input
signals as follows:
 X1: Is it raining?
 X2 : Is it sunny?
So, the value of each input can be either 0 or 1. We can set the weights of both
X1 and X2 to 1 and the threshold value to 1, so the neuron fires whenever at least
one input is 1. The truth table built with respect to the problem is as follows:

X1  X2  ysum  yout
0   0   0     0
0   1   1     1
1   0   1     1
1   1   2     1

From the truth table, I can conclude that John needs to carry an umbrella in the
situations where the value of yout is 1. Hence, he will need to carry an umbrella
in scenarios 2, 3 and 4.
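As a small sketch (the function name and code are illustrative, not part of the original answer), the umbrella neuron can be written directly from the definition above:

```python
def mcculloch_pitts(inputs, weights, threshold):
    """McCulloch-Pitts neuron: fire (output 1) when the weighted
    sum of the binary inputs reaches the threshold, else output 0."""
    ysum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if ysum >= threshold else 0

# Umbrella example: X1 = "is it raining?", X2 = "is it sunny?",
# both weights 1, threshold 1 (an OR gate).
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mcculloch_pitts([x1, x2], [1, 1], 1))
```

Running the loop reproduces the truth table: the neuron outputs 0 only in the first scenario.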

Q2 – GRADIENT DESCENT AND DELTA RULE

ANS - Gradient Descent and Delta Rule

Why the Delta Rule?

The perceptron training rule is guaranteed to converge if and only if the training
examples are linearly separable. If the data are not linearly separable, the
perceptron training rule will simply not converge. This is where the delta rule
comes to the rescue.

Gradient descent is a way to find a minimum of a function in a high-dimensional
space: you move in the direction of steepest descent.
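As a minimal sketch (the function and the example objective are illustrative, not from the source), gradient descent on a one-dimensional function looks like this:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimal gradient descent: repeatedly step in the direction
    of steepest descent, i.e. along the negative gradient."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)
    return x

# Example: minimise f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
# x_min converges toward 3, the minimiser of f.
```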

The delta rule is an update rule for single layer perceptrons. It makes use of gradient descent.

Backpropagation is an efficient implementation of gradient descent, in which the
update rule has recursively defined parts. Those parts belong to neurons of
different layers and are calculated from the output layer (the last layer) back to
the first hidden layer.

In short, the delta rule is the rule used to update the weights while training a
neural network.

Introduction
The delta rule is a formula for updating the weights of a neural network during training. It is
considered a special case of the backpropagation algorithm. The delta rule is in fact a gradient
descent learning rule.
Recall that the process of training a neural network involves iterating over the following
steps:

 A set of input-output sample pairs is selected randomly and run through the neural
network, which makes predictions on these samples.

 The loss between the predictions and the true values is computed.

 The weights are adjusted in a direction that makes the loss smaller.


The delta rule is one algorithm that can be applied repeatedly during training to
modify the network weights and reduce the loss.

Application
The generalized delta rule is important in creating useful networks capable of learning
complex relations between inputs and outputs. Compared to other early learning rules like the
Hebbian learning rule or the Correlation learning rule, the delta rule has the advantage that it
is mathematically derived and directly applicable to supervised learning. In
addition, unlike the perceptron learning rule, which relies on the Heaviside step
function as the activation function (whose derivative does not exist at zero and is
equal to zero elsewhere), the delta rule is applicable to differentiable activation
functions such as tanh and the sigmoid function.

Q3 – MULTILAYER PERCEPTRON ALGO (DRAW AND EXPLAIN)

ANS - Multi-Layered Perceptron Model:

Like a single-layer perceptron model, a multi-layer perceptron model also has the same
model structure but has a greater number of hidden layers.

The multi-layer perceptron model is also known as the Backpropagation algorithm, which
executes in two stages as follows:

o Forward Stage: In the forward stage, activation signals propagate from the input
layer and terminate at the output layer.
o Backward Stage: In the backward stage, weight and bias values are modified as per
the model's requirement: the error between the actual and the desired output is
propagated backward, starting at the output layer and ending at the input layer.

Hence, a multi-layered perceptron model can be considered as an artificial neural
network with multiple layers in which the activation function is not linear, unlike
in a single-layer perceptron model. Instead of a linear function, the activation
function can be sigmoid, TanH, ReLU, etc.

A multi-layer perceptron model has greater processing power and can process linear and non-
linear patterns. Further, it can also implement logic gates such as AND, OR, XOR, NAND,
NOT, XNOR, NOR.
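As a sketch of this claim, a 2-2-1 multi-layer perceptron with hand-picked (not learned) sigmoid weights can compute XOR, which no single-layer perceptron can represent; the weights below are illustrative values chosen so the hidden units approximate OR and NAND:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x1, x2):
    """Forward stage of a 2-2-1 MLP with sigmoid activations.
    The hand-picked weights make h1 approximate OR, h2 approximate
    NAND, and the output their AND, which is exactly XOR."""
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # ~ OR gate
    h2 = sigmoid(-20 * x1 - 20 * x2 + 30)   # ~ NAND gate
    return sigmoid(20 * h1 + 20 * h2 - 30)  # ~ AND(h1, h2) -> XOR
```

Rounding the four outputs reproduces the XOR truth table, showing why the hidden layer and the non-linear activation are essential.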

Q4 – NOTE ON DIFFERENTIABLE THRESHOLD UNIT

ANS - A Differentiable Threshold Unit:
A Differentiable Threshold Unit (DTU) is a concept in neural network
architectures designed to address the need for non-linearity while maintaining
differentiability throughout the training process. Here’s a detailed explanation:

Definition and Purpose:
A Differentiable Threshold Unit (DTU) serves as an alternative to traditional
activation functions (like ReLU, sigmoid, tanh) in neural networks. It introduces a
differentiable threshold operation that can be smoothly integrated into the
gradient-based optimization process used during training.
Characteristics:
Threshold Activation: The key feature of a DTU is its ability to apply a threshold
operation. This operation typically activates the neuron when the input value
exceeds a certain threshold.
Differentiability: Unlike traditional step functions that are not differentiable at the
threshold (leading to gradient issues during backpropagation), a DTU is designed
to be differentiable everywhere, including at the threshold point. This ensures
smooth gradient flow throughout the network, facilitating effective training via
gradient descent methods.
Advantages:
Gradient Stability: Enables stable and efficient gradient-based optimization
during training, avoiding the issues associated with non-differentiable step
functions.
Non-linearity: Introduces non-linearity to the neural network, enhancing its
capacity to model complex relationships in data.
Interpretability: The threshold parameter θ (theta) can be interpreted as a
decision boundary, making the DTU particularly useful in applications where
interpretability of neural network decisions is important.
Applications:
Classification Tasks: DTUs can be used effectively in classification tasks where
clear decision boundaries are required.
Neural Network Architectures: They can be integrated into various neural
network architectures, including feedforward networks, convolutional neural
networks (CNNs), and recurrent neural networks (RNNs).
Implementation:
Mathematical Formulation: Implementing a DTU involves defining a threshold
operation and choosing an appropriate differentiable activation function (e.g.,
sigmoid, tanh) to smooth out the thresholding operation.
Training: DTUs are trained using standard gradient-based optimization
techniques such as stochastic gradient descent (SGD), where gradients are
computed efficiently due to the function’s differentiability.
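A minimal sketch of such a unit, assuming a steep sigmoid centred at the threshold θ (the names `soft_threshold` and the steepness parameter k are illustrative, not standard API):

```python
import math

def soft_threshold(x, theta=0.5, k=10.0):
    """Differentiable threshold: a sigmoid centred at theta.
    As the steepness k grows, the curve approaches the hard step
    function while remaining differentiable everywhere."""
    return 1.0 / (1.0 + math.exp(-k * (x - theta)))

def soft_threshold_grad(x, theta=0.5, k=10.0):
    """Gradient w.r.t. x: k * s * (1 - s). Unlike the hard step,
    it is nonzero even at x == theta, so gradients can flow."""
    s = soft_threshold(x, theta, k)
    return k * s * (1.0 - s)
```

At the threshold itself the output is 0.5 and the gradient is k/4, which is exactly the property that makes gradient-based training possible.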
In essence, a Differentiable Threshold Unit (DTU) offers a way to introduce
thresholding behavior into neural networks while maintaining the ability to
compute gradients throughout the network, ensuring smooth and effective
training. This makes it a valuable tool in modern machine learning architectures,
balancing non-linearity with computational efficiency.

Q5 – BACKPROPAGATION ALGO IN DETAIL

ANS - What is backpropagation?
 In machine learning, backpropagation is an effective algorithm used to
train artificial neural networks, especially in feed-forward neural networks.
 Backpropagation is an iterative algorithm that helps to minimize the cost
function by determining which weights and biases should be adjusted.
During every epoch, the model learns by adapting the weights and biases
to minimize the loss, moving down along the gradient of the error. It is
therefore used together with popular optimization algorithms such as
gradient descent or stochastic gradient descent.

 Computing the gradient in the backpropagation algorithm helps to minimize the
cost function; the computation is implemented using the chain rule from calculus
to navigate through the layers of the neural network.
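The chain-rule computation can be sketched for a single sigmoid neuron with squared loss, and checked against a numerical gradient (the weight, input, and target values are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(w, x, t):
    """Squared loss of one sigmoid neuron and its gradient via the
    chain rule: dL/dw = dL/do * do/dz * dz/dw
                      = (o - t) * o * (1 - o) * x."""
    o = sigmoid(w * x)
    loss = 0.5 * (o - t) ** 2
    grad = (o - t) * o * (1 - o) * x
    return loss, grad

# Sanity check: the analytic gradient should match a finite-difference
# estimate, which is how backpropagation code is commonly verified.
w, x, t, eps = 0.7, 1.3, 1.0, 1e-6
_, g = loss_and_grad(w, x, t)
num = (loss_and_grad(w + eps, x, t)[0]
       - loss_and_grad(w - eps, x, t)[0]) / (2 * eps)
```

In a multi-layer network the same chain rule is applied layer by layer, reusing each layer's intermediate result as it proceeds backward from the output.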

Advantages of Using the Backpropagation Algorithm in Neural Networks
Backpropagation, a fundamental algorithm in training neural networks, offers several
advantages that make it a preferred choice for many machine learning tasks. Here, we discuss
some key advantages of using the backpropagation algorithm:

1. Ease of Implementation: Backpropagation does not require prior knowledge of
neural networks, making it accessible to beginners. Its straightforward nature
simplifies the programming process, as it primarily involves adjusting weights
based on error derivatives.

2. Simplicity and Flexibility: The algorithm's simplicity allows it to be applied
to a wide range of problems and network architectures. Its flexibility makes it
suitable for various scenarios, from simple feedforward networks to complex
recurrent or convolutional neural networks.

3. Efficiency: Backpropagation accelerates the learning process by directly
updating weights based on the calculated error derivatives. This efficiency is
particularly advantageous in training deep neural networks, where learning complex
features would otherwise be time-consuming.

4. Generalization: Backpropagation enables neural networks to generalize well to
unseen data by iteratively adjusting weights during training. This generalization
ability is crucial for developing models that can make accurate predictions on
new, unseen examples.

5. Scalability: Backpropagation scales well with the size of the dataset and
the complexity of the network. This scalability makes it suitable for large-
scale machine learning tasks, where training data and network size are
significant factors.

In conclusion, the backpropagation algorithm offers several advantages that contribute to its
widespread use in training neural networks. Its ease of implementation, simplicity, efficiency,
generalization ability, and scalability make it a valuable tool for developing and training
neural network models for various machine learning applications.
