What Is Cross-Entropy?: 1 Answer
What is cross-entropy?
Asked 3 years, 3 months ago Active 9 months ago Viewed 46k times
I know that there are a lot of explanations of what cross-entropy is, but I'm still confused.
Is it only a method to describe the loss function? Can we use the gradient descent
algorithm to find the minimum of the loss function?
machine-learning cross-entropy
edited Mar 29 '19 at 18:53 by nbro
asked Feb 1 '17 at 21:38 by theateist
Not a good fit for SO. Here's a similar question on the datascience sister site:
datascience.stackexchange.com/questions/9302/… – Metropolis Feb 1 '17 at 21:59
For example, suppose for a specific training instance, the label is B (out of the possible labels
A, B, and C). The one-hot distribution for this training instance is therefore:

Pr(Class A) = 0.0, Pr(Class B) = 1.0, Pr(Class C) = 0.0
You can interpret the above "true" distribution to mean that the training instance has 0%
probability of being class A, 100% probability of being class B, and 0% probability of being
class C.
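As a minimal sketch (not part of the original answer), the one-hot distribution above can be built in plain Python; the class ordering [A, B, C] is an assumption for illustration:

```python
# Build a one-hot "true" distribution for the label B over classes A, B, C.
classes = ["A", "B", "C"]
label = "B"
p = [1.0 if c == label else 0.0 for c in classes]
print(p)  # [0.0, 1.0, 0.0]
```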
Now, suppose your machine learning algorithm predicts the following probability distribution:

Pr(Class A) = 0.228, Pr(Class B) = 0.619, Pr(Class C) = 0.153
How close is the predicted distribution to the true distribution? That is what the cross-entropy
loss determines. Use this formula:

H(p, q) = - sum over x of p(x) * ln(q(x))

Where p(x) is the wanted (true) probability, and q(x) the predicted probability. The sum is over
the three classes A, B, and C. In this case the loss is 0.479:
H = -(0.0*ln(0.228) + 1.0*ln(0.619) + 0.0*ln(0.153)) = 0.479
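The calculation above can be reproduced with NumPy (using `np.log`, the natural log, as the comments below note):

```python
import numpy as np

p = np.array([0.0, 1.0, 0.0])        # true (one-hot) distribution
q = np.array([0.228, 0.619, 0.153])  # predicted distribution

# All q(x) > 0 here, so there is no log(0) issue; the p(x) = 0 terms
# simply contribute nothing to the sum.
H = -np.sum(p * np.log(q))
print(H)  # ~0.4797, which the answer rounds to 0.479
```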
https://stackoverflow.com/questions/41990250/what-is-cross-entropy 1/3
5/6/2020 machine learning - What is cross-entropy? - Stack Overflow
So that is how "wrong" or "far away" your prediction is from the true distribution.
Cross entropy is one out of many possible loss functions (another popular one is SVM hinge
loss). These loss functions are typically written as J(theta) and can be used within gradient
descent, which is an iterative algorithm to move the parameters (or coefficients) towards the
optimum values. The update rule is theta := theta - alpha * dJ(theta)/dtheta; you would
replace J(theta) with H(p, q). But note that you need to compute the derivative of
H(p, q) with respect to the parameters first.
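As a hedged sketch of that idea (not from the original answer), here is one gradient-descent step for a softmax classifier, where J(theta) is the cross-entropy H(p, q). The data, learning rate, and initial weights are made up for illustration:

```python
import numpy as np

x = np.array([1.0, 2.0])           # one training instance (2 features)
p = np.array([0.0, 1.0, 0.0])      # one-hot true distribution (class B)
theta = np.zeros((2, 3))           # weight matrix: features x classes
lr = 0.1                           # learning rate (alpha)

logits = x @ theta
q = np.exp(logits) / np.exp(logits).sum()   # predicted distribution
loss = -np.sum(p * np.log(q))               # H(p, q)

# For softmax followed by cross-entropy, dH/dlogits = q - p,
# so the gradient w.r.t. theta follows by the chain rule:
grad = np.outer(x, q - p)
theta -= lr * grad                 # move parameters toward the optimum

new_q = np.exp(x @ theta) / np.exp(x @ theta).sum()
new_loss = -np.sum(p * np.log(new_q))
print(loss, new_loss)              # the loss decreases after the step
```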
Correct, cross-entropy describes the loss between two probability distributions. It is one of
many possible loss functions.
Then we can use, for example, gradient descent algorithm to find the minimum.
Yes, the cross-entropy loss function can be used as part of gradient descent.
so, cross-entropy describes the loss by sum of probabilities for each example X. – theateist Feb 1 '17
at 22:34
so, can we instead of describing the error as cross-entropy, describe the error as an angle between two
vectors (cosine similarity/ angular distance) and try to minimize the angle? – theateist Feb 1 '17 at
22:55
apparently it's not the best solution, but I just wanted to know, in theory, if we could use cosine
(dis)similarity to describe the error through the angle and then try to minimize the angle. –
theateist Feb 2 '17 at 17:22
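The cosine idea floated in the comments above can at least be computed. This is purely an illustration of the commenter's suggestion, not a recommendation from the answer:

```python
import numpy as np

# Measure "error" as the angle between the true and predicted
# probability vectors from the answer's example.
p = np.array([0.0, 1.0, 0.0])
q = np.array([0.228, 0.619, 0.153])

cos_sim = np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))
angle = np.arccos(cos_sim)   # radians; 0 would mean identical directions
print(cos_sim, angle)
```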
@Stephen: If you look at the example I gave, p(x) would be the list of ground-truth probabilities for
each of the classes, which would be [0.0, 1.0, 0.0]. Likewise, q(x) is the list of predicted
probabilities for each of the classes, [0.228, 0.619, 0.153]. H(p, q) is then -(0 * log(0.228) +
1.0 * log(0.619) + 0 * log(0.153)), which comes out to be 0.479. Note that it's common to use
Python's np.log() function, which is actually the natural log; it doesn't matter. –
stackoverflowuser2010 Oct 20 '17 at 23:02
@HAr: For one-hot encoding of the true label, there is only one non-zero class that we care about.
However, cross-entropy can compare any two probability distributions; it is not necessary that one of
them has one-hot probabilities. – stackoverflowuser2010 Feb 13 '18 at 20:30
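Following up on that comment, here is a small sketch of cross-entropy between two arbitrary (non-one-hot) distributions; the values are made up for illustration:

```python
import numpy as np

p = np.array([0.10, 0.70, 0.20])   # "true" distribution, not one-hot
q = np.array([0.20, 0.50, 0.30])   # predicted distribution

H_pq = -np.sum(p * np.log(q))      # cross-entropy H(p, q)
H_pp = -np.sum(p * np.log(p))      # entropy of p itself, H(p, p)
# H(p, q) >= H(p, p), with equality only when q == p.
print(H_pq, H_pp)
```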