ML Unit 3 Part 1
Bayesian Concept
Introduction
● This chapter introduces the rules of probability and their possible
uses, distribution functions, and hypothesis-testing principles in the
machine learning domain.
● In the 18th century, the mathematician Thomas Bayes developed the
foundational mathematical principles, known as Bayesian methods, which
describe the probability of events.
WHY ARE BAYESIAN METHODS IMPORTANT?
● On the basis of assumption 3, we can say that each hypothesis h within the
space H has equal prior probability, i.e. P(h) = 1/|H| for every h in H.
● Since these prior probabilities sum up to 1, we can write Σ P(h) = 1, where
the sum runs over all h in H.
● Using assumption 1 mentioned above, we can say that if the training data T is
consistent with h, then P(T|h) = 1; otherwise P(T|h) = 0.
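To see how these assumptions combine, here is a short worked derivation (a sketch assuming the standard brute-force Bayesian concept-learning setup, where VS denotes the version space, i.e. the subset of H consistent with T):

P(T) = Σ over all h in H of P(T|h) · P(h) = |VS| × 1/|H|
P(h|T) = P(T|h) · P(h) / P(T) = (1 × 1/|H|) / (|VS| / |H|) = 1/|VS|, for any h consistent with T
P(h|T) = 0, for any h not consistent with T

So, once the training data is observed, every consistent hypothesis becomes equally probable and every inconsistent hypothesis is ruled out.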
Naïve Bayes classifier
● It is a simple technique for building classifiers:
● models that assign class labels to problem instances.
● The basic idea of Bayes’ rule is that the outcome of a hypothesis (h) can be
predicted on the basis of some evidence (E) that can be observed.
● From Bayes’ rule, we observe that P(h|E) = P(E|h) · P(h) / P(E). Two kinds of probability appear in this rule:
● 1. A prior probability of hypothesis h, or P(h):
● This is the probability of an event or hypothesis before the evidence is
observed.
● 2. A posterior probability of h, or P(h|D): This is the probability of an
event or hypothesis after the evidence is observed within the population D.
For example, a person has a height of 182 cm and a weight of 68 kg.
What is the probability that this person belongs to the class ‘basketball
player’?
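As an illustration of how the prior and the posterior relate, here is a minimal Python sketch. The numbers below (the prior for the class ‘basketball player’ and the likelihoods of observing this height and weight) are purely hypothetical and only show how Bayes’ rule combines them:

# Hypothetical prior: fraction of the population that are basketball players
p_player = 0.01
p_not_player = 1 - p_player

# Hypothetical likelihoods of observing height = 182 cm and weight = 68 kg
p_evidence_given_player = 0.20      # P(E | basketball player)
p_evidence_given_not = 0.05         # P(E | not a basketball player)

# Total probability of the evidence (the denominator of Bayes' rule)
p_evidence = (p_evidence_given_player * p_player
              + p_evidence_given_not * p_not_player)

# Posterior: P(basketball player | E) = P(E | player) * P(player) / P(E)
p_player_given_evidence = p_evidence_given_player * p_player / p_evidence
print(round(p_player_given_evidence, 3))    # about 0.039

Even though the evidence is four times more likely for a player, the small prior keeps the posterior low.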
● We take a learning task where each instance x has some attributes and
the target function f(x) can take any value from the finite set of
classification values C.
● We also have a set of training examples for the target function, and the set
of attribute values {a1, a2, …, an} for the new instance is known to us. Our
task is to predict the classification of the new instance.
● According to the approach in Bayes’ theorem, the classification of the
new instance is performed by assigning the most probable target
classification c_MAP on the basis of the attribute values {a1, a2, …, an} of
the new instance. So,
c_MAP = argmax over ci in C of P(ci | a1, a2, …, an) = argmax over ci in C of P(a1, a2, …, an | ci) · P(ci)
(applying Bayes’ theorem and dropping the denominator P(a1, a2, …, an), which is the same for every class).
● So, to get the most probable classification, we have to evaluate the two
terms P(a1, a2, …, an | ci) and P(ci).
● In a practical scenario, it is possible to estimate P(ci) by calculating
the frequency of each target value ci in the training data set.
● The naïve Bayes classifier makes a simple assumption that the attribute
values are conditionally independent of each other given the target value.
● So, applying this simplification, we can now say that for a target value ci
of an instance, the probability of observing the combination a1, a2, …, an
is the product of the probabilities of the individual attributes:
P(a1, a2, …, an | ci) = P(a1 | ci) × P(a2 | ci) × … × P(an | ci).
The classifier therefore outputs c_NB = argmax over ci in C of P(ci) × P(a1 | ci) × … × P(an | ci), as sketched below.
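A minimal sketch of this argmax in Python, assuming the class priors P(ci) and the per-attribute conditional probabilities P(aj | ci) have already been estimated; the dictionaries below are hypothetical placeholders:

# Hypothetical estimates; in practice these come from counting the training data.
priors = {"yes": 0.6, "no": 0.4}                 # P(ci)
conditionals = {                                  # P(aj | ci), one table per class
    "yes": {"weather=sunny": 0.5, "humidity=high": 0.3},
    "no":  {"weather=sunny": 0.2, "humidity=high": 0.7},
}

def naive_bayes_classify(attributes, priors, conditionals):
    """Return the class ci that maximises P(ci) * product of P(aj | ci)."""
    best_class, best_score = None, -1.0
    for ci, prior in priors.items():
        score = prior
        for a in attributes:
            score *= conditionals[ci][a]
        if score > best_score:
            best_class, best_score = ci, score
    return best_class

print(naive_bayes_classify(["weather=sunny", "humidity=high"], priors, conditionals))  # "yes"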
A key benefit of the naïve Bayes classifier is that it requires only a small
amount of training data to estimate the parameters (such as the means and
variances of the variables) necessary for classification.
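When the attributes are numeric (as in the height and weight example above), those parameters are typically the per-class mean and variance of each attribute, plugged into a Gaussian likelihood. A minimal sketch, using a small hypothetical training set:

import math

# Hypothetical numeric training data: (height_cm, weight_kg) per class
data = {
    "player":     [(195.0, 90.0), (200.0, 95.0), (190.0, 85.0)],
    "non_player": [(170.0, 70.0), (165.0, 60.0), (180.0, 75.0)],
}

def mean_var(values):
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v

# Estimate the per-class mean and variance of each attribute (the "parameters")
params = {
    ci: [mean_var([row[j] for row in rows]) for j in range(2)]
    for ci, rows in data.items()
}

def gaussian(x, m, v):
    """Gaussian likelihood of attribute value x given class mean m and variance v."""
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

# Likelihood of the new instance (182 cm, 68 kg) under each class
for ci, ps in params.items():
    likelihood = gaussian(182.0, *ps[0]) * gaussian(68.0, *ps[1])
    print(ci, likelihood)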
Strengths and Weaknesses of Bayes Classifiers
Naïve Bayes classifier steps
● Step 1: First construct a frequency table. A frequency table is drawn
for each attribute against the target outcome.
● For example, in the figure, the various attributes are
(1) Weather Condition,
(2) how many matches the team won in its last three matches,
(3) Humidity Condition, and
(4) whether they won the toss,
and the target outcome is whether they will win the match or not.
● Step 2: Identify the cumulative probability for ‘Won match = Yes’ and the
probability for ‘Won match = No’ on the basis of all the attributes.
Construct Frequency Table
● Step 3: Calculate the probability through normalization by applying the
formula below (a worked sketch follows this step):
P(ci | a1, a2, …, an) = [P(ci) × P(a1 | ci) × … × P(an | ci)] / Σ over all classes ck of [P(ck) × P(a1 | ck) × … × P(an | ck)]
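A minimal end-to-end sketch of these three steps in Python, using a small hypothetical data set (the rows and attribute values below are invented for illustration and are not taken from the figure):

from collections import Counter, defaultdict

# Hypothetical training rows: (weather, won_toss, won_match)
rows = [
    ("sunny", "yes", "yes"), ("sunny", "no", "no"),
    ("overcast", "yes", "yes"), ("rainy", "no", "no"),
    ("rainy", "yes", "yes"), ("sunny", "yes", "yes"),
]

# Step 1: frequency tables of each attribute value against the target outcome
class_counts = Counter(r[-1] for r in rows)
freq = defaultdict(Counter)            # freq[(attribute, value)][class] = count
for weather, toss, won in rows:
    freq[("weather", weather)][won] += 1
    freq[("toss", toss)][won] += 1

# Step 2: un-normalised score for each outcome, P(c) * product of P(a | c)
new_instance = [("weather", "sunny"), ("toss", "yes")]
scores = {}
for c, n_c in class_counts.items():
    score = n_c / len(rows)                        # prior P(c)
    for key in new_instance:
        score *= freq[key][c] / n_c                # likelihood P(a | c)
    scores[c] = score

# Step 3: normalise so that the posterior probabilities sum to 1
total = sum(scores.values())
posteriors = {c: s / total for c, s in scores.items()}
print(posteriors)                                  # e.g. {'yes': 1.0, 'no': 0.0}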
Solving the above problem with Naive Bayes