ML (CS-601) Unit 4 Complete
UNIT 4
Recurrent Neural Network (RNN)
• A recurrent neural network (RNN) is any network whose neurons
send feedback signals to each other.
• A recurrent neural network is a type of neural network that
contains loops, allowing information to be stored within the network.
• To achieve this, the RNN builds loops into the network, which
allows it to persist information from earlier steps.
This loop structure allows the neural network to take a sequence as
input.
• Thus the RNN came into existence, solving the problem of handling
sequential input with the help of a hidden layer. The main and most
important feature of an RNN is its hidden state, which remembers
information about the sequence seen so far.
• An RNN has a “memory” that retains information about what has been
computed so far. It uses the same parameters for every input because it
performs the same task at each step of the sequence to produce
the output.
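As a concrete illustration (not from the original notes), here is a minimal sketch of one vanilla RNN step in Python with NumPy; the weight names Wxh, Whh, and bh are illustrative:

```python
import numpy as np

def rnn_step(x, h_prev, Wxh, Whh, bh):
    """One step of a vanilla RNN: the new hidden state mixes the current
    input with the previous hidden state (the network's 'memory')."""
    return np.tanh(Wxh @ x + Whh @ h_prev + bh)

# The same parameters are reused at every step of the sequence.
rng = np.random.default_rng(0)
Wxh, Whh, bh = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)                       # initial hidden state
for x in rng.normal(size=(5, 3)):     # a sequence of 5 three-dimensional inputs
    h = rnn_step(x, h, Wxh, Whh, bh)  # h carries information forward
```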
• In a Long Short-Term Memory (LSTM) network, the long-term memory
refers to the learned weights, and the short-term memory refers to the
gated cell-state values that change with each step through time.
Structure Of LSTM:
An LSTM has a chain structure that contains four interacting neural
network layers and memory blocks called cells.
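A minimal sketch of one LSTM step, assuming the standard four gate layers (forget, input, candidate, output); the parameter names Wf, Wi, Wg, Wo are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step: four small networks (gates) control the cell state."""
    z = np.concatenate([h_prev, x])      # combined input to all gates
    f = sigmoid(p["Wf"] @ z + p["bf"])   # forget gate: what to erase from c
    i = sigmoid(p["Wi"] @ z + p["bi"])   # input gate: what to write to c
    g = np.tanh(p["Wg"] @ z + p["bg"])   # candidate values to write
    o = sigmoid(p["Wo"] @ z + p["bo"])   # output gate: what to expose as h
    c = f * c_prev + i * g               # new cell state (gated memory)
    h = o * np.tanh(c)                   # new hidden state
    return h, c
```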
Beam Search
Without beam search, the worst-case time and space complexity of
best-first search would be O(b^m), where b is the branching factor and
m is the maximum depth.
Beam width
Beam width, or beam size, is a parameter in the beam search algorithm
that determines how many of the best partial solutions (adjacent
nodes) to keep for evaluation at each step.
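A minimal beam search sketch; expand and score are assumed interfaces, not part of the original notes:

```python
import heapq

def beam_search(start, expand, score, beam_width=3, max_depth=10):
    """Keep only the `beam_width` best partial sequences at each depth.
    `expand(seq)` yields possible next tokens; `score(seq)` returns a
    number where higher is better. Both are assumed interfaces."""
    beam = [[start]]
    for _ in range(max_depth):
        # Grow every surviving partial solution by one token.
        candidates = [seq + [tok] for seq in beam for tok in expand(seq)]
        if not candidates:
            break
        # Prune: keep only the beam_width highest-scoring candidates.
        beam = heapq.nlargest(beam_width, candidates, key=score)
    return max(beam, key=score)
```

With beam_width = b this degenerates toward exhaustive search; a small fixed width trades completeness for bounded time and memory.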
BLEU Score
The BLEU (Bilingual Evaluation Understudy) score is a string-matching
algorithm that provides a basic measure of output quality. At its core,
BLEU is nothing more than a method to measure the similarity between
two text strings.
• A fundamental problem/limitation with BLEU is that it DOES NOT EVEN TRY
to measure “translation quality”, but rather focuses on STRING SIMILARITY.
Scoring process
• The BLEU algorithm compares consecutive phrases of the automatic
translation with the consecutive phrases it finds in the reference
translation, and counts the number of matches.
• These matches are position independent.
• A higher match degree indicates a higher degree of similarity with the
reference translation, and a higher score.
• A comparison between BLEU scores is only justifiable when the results
are computed on the same test set, the same language pair, and the
same MT engine.
• A value of 0 means that the machine-translated output has no overlap
with the reference translation (low quality) while a value of 1 means
there is perfect overlap with the reference translations (high quality).
Calculating the BLEU score
To compute the BLEU score for each translation, we compute the
following statistics.
• N-Gram Precisions
The n-gram overlap counts how many unigrams, bigrams, trigrams, and
four-grams (n = 1, 2, 3, 4) match their n-gram counterparts in the
reference translations.
• Brevity Penalty
BP stands for brevity penalty. Since BLEU is a kind of precision, short
outputs would score highly without BP. The penalty is defined as:
BP = 1 if c > r, and BP = exp(1 − r/c) if c ≤ r,
where c is the length of the candidate translation and r is the length
of the reference. The final score combines BP with the n-gram
precisions: BLEU = BP · exp(Σn wn · log pn), with uniform weights
wn = 1/4.
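A minimal single-reference sketch of this computation in Python (no smoothing; the function and variable names are illustrative):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Clipped n-gram precisions for n = 1..4 combined with the brevity
    penalty. Single reference, no smoothing -- a sketch, not a scorer."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        cand = Counter(ngrams(candidate, n))
        ref = Counter(ngrams(reference, n))
        # Each candidate n-gram counts at most as often as it appears
        # in the reference (position-independent, clipped matching).
        matches = sum(min(c, ref[g]) for g, c in cand.items())
        total = sum(cand.values())
        if matches == 0 or total == 0:
            return 0.0          # one zero precision zeroes the whole score
        log_prec += math.log(matches / total) / max_n
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1 - r / c)   # brevity penalty
    return bp * math.exp(log_prec)
```

Without smoothing, any n-gram order with zero matches drives the score to 0, which is why production scorers smooth the higher-order precisions.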
• A problem with the encoder–decoder architecture lies in the fact that
the encoder must compress the entire input sequence x1, x2, x3, x4 into
a single vector c, which can cause information loss. Moreover, the
decoder needs to decipher the passed information from this single
vector, a complex task in itself.
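A minimal sketch of the bottleneck, assuming a generic rnn_step function like the one shown earlier: the encoder's final hidden state is the only thing handed to the decoder.

```python
def encode(inputs, rnn_step, h0):
    """Run the encoder over the whole input sequence; everything the
    decoder will ever see is compressed into the final hidden state."""
    h = h0
    for x in inputs:    # x1, x2, x3, x4, ...
        h = rnn_step(x, h)
    return h            # c: one fixed-size vector for the whole sequence
```

However long the input is, c has a fixed size, which is exactly the information loss described above.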
Markov Decision Process (MDP)
• A State S is a set of tokens that represents every state the agent
can be in.
• A Model (sometimes called a Transition Model) gives an action’s effect
in a state.
• An Action A is the set of all possible actions. A(s) defines the set of
actions that can be taken in state s.
• A Reward is a real-valued function R(s, a) giving the reward for taking
action a in state s.
• A Policy is a solution to the Markov Decision Process: a mapping from
states to actions (see the sketch below).
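A tiny illustrative MDP written as plain Python data; the two states, their actions, and the reward numbers are invented for this sketch, and transitions are kept deterministic for simplicity:

```python
# States, actions A(s), transition model T(s, a) -> s', and rewards R(s, a).
# Deterministic transitions keep the sketch simple; general MDPs use
# probabilities over successor states.
states  = ["s0", "s1"]
actions = {"s0": ["stay", "go"], "s1": ["stay", "go"]}
T = {("s0", "stay"): "s0", ("s0", "go"): "s1",
     ("s1", "stay"): "s1", ("s1", "go"): "s0"}
R = {("s0", "stay"): 0.0, ("s0", "go"): 1.0,
     ("s1", "stay"): 2.0, ("s1", "go"): 0.0}
```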
Bellman Equation
• The Bellman equation is the basic building block for solving
reinforcement learning problems and is omnipresent in RL. It helps us
solve an MDP, where solving means finding the optimal policy
and value functions.
• The optimal value function V*(s) is the one that yields the
maximum value.
• The value of a given state is equal to the maximum over actions (the
action that maximizes the value) of the reward of the optimal action in
the given state, plus a discount factor multiplied by the value of the
next state, as given by the Bellman equation:
V(s) = max_a [ R(s, a) + γ · V(s′) ]
Value Iteration and Policy Iteration
We solve a Bellman equation using two powerful
algorithms:
i. Value Iteration
ii. Policy Iteration
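As a concrete illustration of value iteration, here is a minimal sketch that applies the Bellman update to the toy MDP data above until the values converge; gamma and the tolerance are illustrative choices:

```python
def value_iteration(states, actions, T, R, gamma=0.9, tol=1e-6):
    """Repeatedly apply the Bellman optimality update
    V(s) <- max_a [ R(s, a) + gamma * V(T(s, a)) ] until convergence."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            v = max(R[(s, a)] + gamma * V[T[(s, a)]] for a in actions[s])
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            break
    # Extract the greedy policy: the action maximizing the Bellman backup.
    policy = {s: max(actions[s], key=lambda a: R[(s, a)] + gamma * V[T[(s, a)]])
              for s in states}
    return V, policy

# V, pi = value_iteration(states, actions, T, R)   # using the toy MDP above
```

Policy iteration instead alternates full policy evaluation with greedy policy improvement; both converge to the same optimal policy.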