
DOC rewrite descriptions of P/R/F averages and define support #1974


Closed
wants to merge 5 commits into from

Conversation

jnothman
Member

Following on from #1945, this is an attempt to explain the averages a different way. I'm not sure if the reiteration of the descriptions in notation is helpful, though.

@arjoly
Member

arjoly commented May 20, 2013

You define precision and recall as with average="samples". However, that definition fails in the multiclass case.

I prefer the more verbose definitions of avg_precision, avg_recall and avg_F. I think that redefining y_j and w_j for every average parameter is misleading, especially given the previous definition of y as a label set.

@jnothman
Member Author

I am a bit confused by that comment:

  • I didn't actually define y as a label set. I leave intentionally unspecified what it's a set of. P, R and F can be calculated over any two sets. So I don't see how my description is suited to average="samples".
  • Does "I prefer the more verbose definition of avg_precision..." mean you would rather each formula be expanded out? I personally like the formulas being a function of vectors y and w, but certainly think all should use the same set notation.

@arjoly
Member

arjoly commented May 20, 2013

I didn't actually define y as a label set. I leave intentionally unspecified what it's a set of. P, R and F can be calculated over any two sets. So I don't see how my description is suited to average="samples"

In the narrative doc, you say

For these purposes, it is clearer to redefine our metrics in terms of sets.

For a true set :math:`\hat{y}` and predicted set :math:`y`, we may redefine:

.. math::

    \text{precision} = \frac{\left| y \cap \hat{y} \right|}{\left|y\right|},

.. math::

    \text{recall} = \frac{\left| y \cap \hat{y} \right|}{\left|\hat{y}\right|}.
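The set formulas quoted above can be sketched directly in Python, treating a prediction and the ground truth as plain sets. This is a minimal sketch, not scikit-learn code; the helper names `set_precision` and `set_recall` are hypothetical.

```python
def set_precision(y_pred, y_true):
    # |y ∩ y_hat| / |y|: fraction of predicted items that are correct
    return len(y_pred & y_true) / len(y_pred)

def set_recall(y_pred, y_true):
    # |y ∩ y_hat| / |y_hat|: fraction of true items that were predicted
    return len(y_pred & y_true) / len(y_true)

# Two of the three predicted items are in the true set, and two of the
# three true items were predicted, so both scores come out to 2/3.
y_true = {"a", "b", "c"}
y_pred = {"b", "c", "d"}
print(set_precision(y_pred, y_true), set_recall(y_pred, y_true))
```

Note that the definitions deliberately say nothing about what the set elements are; that choice is what distinguishes the averaging modes.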

you also say

Defining :math:`y_l` and :math:`\hat{y}_l` to be sets of samples with label
:math:`l`

But later in the same section, :math:`y_j` consists either of all (sample, label) pairs, of samples assigned label :math:`j`, or of labels assigned to sample :math:`j`.

So I don't see how my description is suited to average="samples".

The two definitions that you gave of precision and recall

    \text{precision} = \frac{\left| y \cap \hat{y} \right|}{\left|y\right|},
    \text{recall} = \frac{\left| y \cap \hat{y} \right|}{\left|\hat{y}\right|}.

apply per sample in the multilabel case when average="samples", and they reduce to classification accuracy in the multiclass case.
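The reduction to accuracy claimed above is easy to verify with a toy sketch: in the multiclass case each sample's true and predicted sets are singletons, so per-sample set precision and recall are 1 on a correct prediction and 0 otherwise, and averaging over samples gives plain accuracy. The variable names below are illustrative only.

```python
y_true = ["cat", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "bird", "bird"]

# Singleton sets per sample: |{t} ∩ {p}| is 1 if correct, else 0,
# and |{t}| = |{p}| = 1, so precision = recall = the hit indicator.
per_sample = [len({t} & {p}) for t, p in zip(y_true, y_pred)]
samples_avg = sum(per_sample) / len(per_sample)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(samples_avg, accuracy)  # the two quantities coincide
```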

Does "I prefer the more verbose definition of avg_precision..." mean you would rather each formula be expanded out? I personally like the formulas being a function of vectors y and w, but certainly think all should use the same set notation.

Yes, I prefer that each formula be expanded. I think it is more explicit.

@jnothman
Member Author

Okay. That's not how I intended it to be read, which means it is unclear. I meant y and \hat{y} to be sets of arbitrary objects. For micro-averaging, they are sets of (sample, label) pairs; for samples averaging they are sets of (label) per sample; for the others, they are sets of (sample) per label. I thought that generalisation would help clarify it.

Would it be okay if I said explicitly that y and \hat{y} were made up of (sample, label) pairs, and then define y_l to be the subset of y with label l (perhaps y_l = {(s, l') \in y | l' = l}), and elsewhere y_s to be the subset of y with sample s or something similar?

It is precisely this sort of subset selection that defines all sorts of variations of precision and recall (e.g. variant metrics for named entity recognition that have lenient matches on boundary), and I think this set-based definition should be prominent.
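The (sample, label)-pair framing proposed above can be sketched as follows: y and \hat{y} are sets of (sample, label) pairs, y_l is the subset with label l, micro-averaging computes precision over all pairs at once, and macro-averaging computes it per label subset and then averages. This is a hypothetical illustration of the proposal, not the merged documentation.

```python
# Sets of (sample, label) pairs, as suggested above.
y_true = {(0, "a"), (0, "b"), (1, "a"), (2, "c")}
y_pred = {(0, "a"), (1, "b"), (2, "c")}

def precision(pred, true):
    return len(pred & true) / len(pred) if pred else 0.0

def subset(pairs, label):
    # y_l = {(s, l') in y | l' = l}: keep only pairs carrying label l
    return {(s, l) for (s, l) in pairs if l == label}

# Micro: one precision over the full pair sets.
micro = precision(y_pred, y_true)

# Macro: per-label precision on the subsets, then an unweighted mean.
labels = {l for _, l in y_true | y_pred}
macro = sum(precision(subset(y_pred, l), subset(y_true, l))
            for l in labels) / len(labels)
print(micro, macro)
```

Swapping `subset` for a selection over samples instead of labels would give the samples-averaged variant, which is the generalisation the comment argues for.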

@jnothman
Member Author

How about this version? I think this makes clear the similarities and differences between the different metrics.

``'weighted'``:
Average over classes weighted by support (takes imbalance into account).
Member

It may just be me, but I quite like the phrase *takes imbalance into account* being part of the description. Perhaps combine it with what you wrote? Just a thought :)
Beyond that, I like the current version and am +1 for merge

Member Author

Hrmh. See I thought it only needed specifying on 'macro' because it's the odd one out.

I guess what you want to get across is that "weighted" adjusts "macro" in order to take label imbalance into account. I'll try to include something like this before merge.
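The distinction being negotiated above can be sketched numerically: 'weighted' is 'macro' with each class's score weighted by its support (its number of true instances), so the score of a rare class no longer counts as much as that of a frequent one. A minimal sketch with made-up data, shown for recall:

```python
from collections import Counter

y_true = ["a", "a", "a", "b"]   # class "a" dominates (support 3 vs 1)
y_pred = ["a", "a", "b", "b"]

support = Counter(y_true)
labels = sorted(support)

def recall_of(label):
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    return tp / support[label]

# Macro: unweighted mean over classes -- both classes count equally.
macro = sum(recall_of(l) for l in labels) / len(labels)

# Weighted: each class's recall weighted by its support.
weighted = sum(recall_of(l) * support[l] for l in labels) / len(y_true)
print(macro, weighted)  # they differ because the classes are imbalanced
```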

As Gilles said, in general, more motivation for metric choice still needs to be included in the narrative...

@jnothman
Member Author

Added notes on accounting for label imbalance in "weighted"; merged as 7ebfd57.

Does this sort of documentation change belong in What's New?

@jnothman jnothman closed this May 21, 2013
@GaelVaroquaux
Member

Does this sort of documentation change belong in What's New?

I don't think so, sorry.

@jnothman
Member Author

I don't mind! Just checking because I hadn't at first realised it was common to update What's New within a patch, so now I'm curious to know what qualifies.

@GaelVaroquaux
Member

Just checking because I hadn't at first realised it was common to update What's New within a patch, so now I'm curious to know what qualifies.

Think of it in the shoes of a user: it's the document that you want to
read to see what has changed in scikit-learn with a new release.

@jnothman
Copy link
Member Author

Got it. Thanks.
