
PROBABILITY ESTIMATION

PROBABILITY
• Probability is a measure of how likely an event is to
occur. Many events cannot be predicted with total
certainty; we can only predict the chance that they
will happen. Probability ranges from 0 to 1, where 0
means the event is impossible and 1 means the event
is certain. The probabilities of all the events in a
sample space add up to 1.
• For example, when we toss a coin, there are only two
possible outcomes: Head or Tail (H, T). But when two
coins are tossed, there are four possible outcomes,
i.e. {(H, H), (H, T), (T, H), (T, T)}.
Formula for Probability

• The probability of an event is defined as the ratio of
the number of favourable outcomes to the total
number of outcomes.
• Probability of an event E: P(E) = Number of
favourable outcomes / Total number of outcomes
• 1) There are 6 pillows on a bed: 3 are red, 2 are
yellow and 1 is blue. What is the probability of
picking a yellow pillow?
• Ans: The probability equals the number of yellow
pillows divided by the total number of pillows,
i.e. 2/6 = 1/3.
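The formula translates directly into code. Below is a minimal Python sketch (an illustration, not from the original slides) that computes the classical probability as an exact fraction, using the pillow example:

```python
# Classical probability: P(E) = number of favourable outcomes / total outcomes.
from fractions import Fraction

def probability(favourable: int, total: int) -> Fraction:
    """Return the classical probability of an event as an exact fraction."""
    return Fraction(favourable, total)

# The pillow example: 2 yellow pillows out of 6 pillows in total.
print(probability(2, 6))  # 1/3
```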
What Is Prior Probability?
Prior probability, in Bayesian statistics, is the probability of an
event before new data is collected. This is the best rational
assessment of the probability of an outcome based on the current
knowledge before an experiment is performed.
The prior probability of an event will be revised as new data or
information becomes available, to produce a more accurate
measure of a potential outcome.
Consider rolling a fair six-sided die and asking for the probability
of an even number. The number of desired outcomes is 3 (rolling a
2, 4, or 6), and there are 6 outcomes in total. The a priori probability
for this example is calculated as follows: a priori probability = 3/6 = 50%.
Therefore, the a priori probability of rolling a 2, 4, or 6 is 50%.
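As a quick check, the a priori figure can be reproduced by simple enumeration. The snippet below is an illustrative sketch, not part of the original example:

```python
# Enumerate the faces of a fair die and count the even ones (2, 4, 6).
faces = [1, 2, 3, 4, 5, 6]
even = [f for f in faces if f % 2 == 0]
print(len(even) / len(faces))  # 0.5, i.e. the 50% a priori probability
```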
EXAMPLE
For example, three acres of land have the labels A,
B, and C. One acre has reserves of oil below its
surface, while the other two do not. The prior
probability of oil being found on acre C is one third,
or 0.333. But if a drilling test is conducted on acre
B, and the results indicate that no oil is present at
the location, then the posterior probability of oil
being found on each of acres A and C becomes 0.5,
since each remaining acre has one chance in two.
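One way to see this update is to zero out the ruled-out acre and renormalise the remaining probabilities. The following Python sketch (our way of coding the example, not part of the source) does exactly that:

```python
# Uniform prior over the three acres: each has a 1/3 chance of holding oil.
priors = {"A": 1/3, "B": 1/3, "C": 1/3}

# Evidence from the drilling test: no oil on acre B.
priors["B"] = 0.0

# Renormalise so the remaining probabilities sum to 1.
total = sum(priors.values())
posteriors = {acre: p / total for acre, p in priors.items()}
print(posteriors)  # {'A': 0.5, 'B': 0.0, 'C': 0.5}
```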
Bayes' Theorem
P(A|B) = P(A∩B)/P(B) = P(A) × P(B|A)/P(B)
where:
P(A) = the prior probability of A occurring
P(A|B) = the conditional probability of A given that B occurs
P(B|A) = the conditional probability of B given that A occurs
P(B) = the probability of B occurring
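The theorem is a one-line computation. Here is a minimal Python sketch, with argument names chosen here to mirror the symbols above:

```python
def bayes(p_a: float, p_b_given_a: float, p_b: float) -> float:
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    return p_a * p_b_given_a / p_b

# Sanity check with the oil-acre example: P(oil on C | no oil on B)
# = P(C) * P(no oil on B | C) / P(no oil on B) = (1/3 * 1) / (2/3) = 0.5.
print(bayes(1/3, 1.0, 2/3))  # 0.5
```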
Contd.
• If we are interested in the probability of an event for which
we have prior observations, we call this the prior
probability. We'll call this event A, and its probability
P(A). If a second event B affects P(A), then we want to
know the probability of A given that B has occurred. In
probabilistic notation, this is P(A|B), and it is known as the
posterior probability or revised probability, because it is
assessed after the original event, hence the 'post' in
posterior. This is how Bayes' theorem allows us to update
our previous beliefs with new information.
Base rate bias
A more complex example involving conditional probabilities is given by
Casscells, Schoenberger and Graboys (1978), and relates to the problem
of ‘false positives’. This involves a situation where a person takes a medical
test, maybe for a disease like HIV, where there is a very low probability (in
most circumstances) of having the disease, say one in a thousand.
However, there is a chance of a false prediction; the test may only be 95%
accurate. Under these circumstances people tend to ignore the rarity of
the phenomenon (disease) in the population, referred to as the base rate,
and wildly overestimate the probability of actually being sick. Even the
majority of Harvard Medical School doctors failed to get the right answer.
For every thousand patients tested, one will be actually sick while there
will be fifty false positives. Thus there is only a one in fifty-one chance of a
positive result meaning that the patient is actually sick.
This example can be explained in more detail using Bayes’ theorem. For
simplicity, it is assumed initially that if the patient has the disease the
test returns a positive result 100% of the time, meaning that there are
no false negatives. Let A represent the condition in which the patient
has the disease, and B represent the evidence of a positive test result.
Then, the probability that the patient actually has the disease given the
positive test result is
P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|not A)P(not A)]
= (1 × 0.001) / (1 × 0.001 + 0.05 × 0.999) ≈ 0.0196
This means that the probability that a positive result is a false positive is
about 1 − 0.0196 = 0.98, or 98%. If, more realistically, there is also a
chance of the test returning a false negative, this would mean that
P(B|A) < 1, and this would modify the result slightly. The difference
would be small, assuming that the chance of a false negative is low; for
example, if the probability of a positive result given that the person has
the disease is 0.99, then P(A|B) = 0.0194.
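Both figures can be reproduced with a few lines of Python. This is an illustrative sketch; the parameter names are ours, not from the source:

```python
def posterior(base_rate: float, sensitivity: float, false_pos_rate: float) -> float:
    """P(disease | positive test) via Bayes' theorem.

    base_rate      = P(A),       the prevalence of the disease
    sensitivity    = P(B|A),     probability of a positive result given disease
    false_pos_rate = P(B|not A), probability of a positive result given no disease
    """
    p_positive = sensitivity * base_rate + false_pos_rate * (1 - base_rate)
    return sensitivity * base_rate / p_positive

print(posterior(0.001, 1.00, 0.05))  # ~0.0196 (no false negatives)
print(posterior(0.001, 0.99, 0.05))  # ~0.0194 (small false-negative rate)
```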
The ‘law of small numbers’
The main error here is that people apply principles that hold for
infinite populations to small samples. We will examine the model
described by Rabin (2002a). This model examines the situation where
people are observing a sequence of signals from a process that involves
independent and identically distributed random variables. This means
that each random variable has the same probability distribution as the
others and all are mutually independent. A simple example is a
sequence of coin tosses, where the probability distribution is 0.5 for a
head and 0.5 for a tail for each toss, and the outcome of each toss has
no effect on the outcome of any other toss. The model assumes that
people believe, incorrectly, that the signals are drawn from an urn of
finite size without replacement, whereas the correct assumption in this
case is that there is replacement after each draw from the urn.
The ‘gambler’s fallacy’ effect
This effect derives its name from the observation that gamblers frequently
expect a certain slot machine or a number that has not won in a while to be ‘due’
to win. We find that this effect occurs when the distribution of signals is known,
as it is with the coin toss situation. If an urn contains 10 balls, 5 representing Up
and 5 representing Down, and one ball is drawn at a time with replacement, this
experiment is identical to tossing a coin. Thus if 3 successive draws all result in an
Up outcome (equivalent to 3 heads in a row), then the rational person will
estimate the probability of an Up on the next draw as 0.5. However, if the person
believes that the balls are not being replaced, then there are only 2 Up
balls left in the urn out of 7 balls in total, so they will estimate the probability of
the next draw being Up as only 2/7 or about 0.286, with the probability of Down
being 0.714. This is an example of the representativeness heuristic, in that the
sequence Up, Up, Up, Down is judged as being more representative of the
population than the sequence Up, Up, Up, Up.
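The two beliefs can be put side by side in code. The sketch below is our illustration of the urn setup described above (following Rabin's model), contrasting the rational with-replacement estimate against the fallacious without-replacement estimate after three Up signals:

```python
# Urn with 5 Up and 5 Down balls; three Up signals have been observed.
up, down, observed_up = 5, 5, 3

# Rational belief: draws are with replacement, so the urn never changes.
p_rational = up / (up + down)                               # 0.5

# Gambler's-fallacy belief: the observed Up balls were not replaced.
p_fallacy = (up - observed_up) / (up + down - observed_up)  # 2/7

print(p_rational)           # 0.5
print(round(p_fallacy, 3))  # 0.286
```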
