
DOC rewrite descriptions of P/R/F averages and define support #1974


Closed
wants to merge 5 commits into from

Conversation

jnothman
Member

Following on from #1945, this is an attempt to explain the averages a different way. I'm not sure if the reiteration of the descriptions in notation is helpful, though.

@arjoly
Member

arjoly commented May 20, 2013

You define precision and recall as with average="samples". However, that definition fails in the multiclass case.

I prefer the more verbose definitions of avg_precision, avg_recall and avg_F. I think that redefining y_j and w_j for every average parameter is misleading, especially given the previous definition of y as a label set.

@jnothman
Member Author

I am a bit confused by that comment:

  • I didn't actually define y as a label set. I leave intentionally unspecified what it's a set of. P, R and F can be calculated over any two sets. So I don't see how my description is suited to average="samples".
  • Does "I prefer the more verbose definition of avg_precision..." mean you would rather each formula be expanded out? I personally like the formulas being a function of vectors y and w, but certainly think all should use the same set notation.

@arjoly
Member

arjoly commented May 20, 2013

I didn't actually define y as a label set. I leave intentionally unspecified what it's a set of. P, R and F can be calculated over any two sets. So I don't see how my description is suited to average="samples"

In the narrative doc, you say

For these purposes, it is clearer to redefine our metrics in terms of sets.

For a true set :math:`\hat{y}` and predicted set :math:`y`, we may redefine:

.. math::

    \text{precision} = \frac{\left| y \cap \hat{y} \right|}{\left|y\right|},

.. math::

    \text{recall} = \frac{\left| y \cap \hat{y} \right|}{\left|\hat{y}\right|}.
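The set formulas quoted above can be sketched directly in Python, treating a prediction and the ground truth as plain sets. This is a minimal sketch, not scikit-learn code; the helper names `set_precision` and `set_recall` are hypothetical.

```python
def set_precision(y_pred, y_true):
    # |y ∩ y_hat| / |y|: fraction of predicted items that are correct
    return len(y_pred & y_true) / len(y_pred)

def set_recall(y_pred, y_true):
    # |y ∩ y_hat| / |y_hat|: fraction of true items that were predicted
    return len(y_pred & y_true) / len(y_true)

# Two of the three predicted items are in the true set, and two of the
# three true items were predicted, so both scores come out to 2/3.
y_true = {"a", "b", "c"}
y_pred = {"b", "c", "d"}
print(set_precision(y_pred, y_true), set_recall(y_pred, y_true))
```

Note that the definitions deliberately say nothing about what the set elements are; that choice is what distinguishes the averaging modes.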

you also say

Defining :math:`y_l` and :math:`\hat{y}_l` to be sets of samples with label
:math:`l`

But later in the same section, :math:`y_j` consists either of all (sample, label) pairs, of samples assigned label :math:`j`, or of labels assigned to sample :math:`j`.

So I don't see how my description is suited to average="samples".

The two definitions that you gave of precision and recall

    \text{precision} = \frac{\left| y \cap \hat{y} \right|}{\left|y\right|},
    \text{recall} = \frac{\left| y \cap \hat{y} \right|}{\left|\hat{y}\right|}.

apply per sample in the multilabel case when average="samples", and they reduce to classification accuracy in the multiclass case.
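The reduction to accuracy claimed above is easy to verify with a toy sketch: in the multiclass case each sample's true and predicted sets are singletons, so per-sample set precision and recall are 1 on a correct prediction and 0 otherwise, and averaging over samples gives plain accuracy. The variable names below are illustrative only.

```python
y_true = ["cat", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "bird", "bird"]

# Singleton sets per sample: |{t} ∩ {p}| is 1 if correct, else 0,
# and |{t}| = |{p}| = 1, so precision = recall = the hit indicator.
per_sample = [len({t} & {p}) for t, p in zip(y_true, y_pred)]
samples_avg = sum(per_sample) / len(per_sample)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(samples_avg, accuracy)  # the two quantities coincide
```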

Does "I prefer the more verbose definition of avg_precision..." mean you would rather each formula be expanded out? I personally like the formulas being a function of vectors y and w, but certainly think all should use the same set notation.

Yes, I prefer that each formula be expanded. I think it is more explicit.

@jnothman
Member Author

Okay. That's not how I intended it to be read, which means it is unclear. I meant y and \hat{y} to be sets of arbitrary objects. For micro-averaging, they are sets of (sample, label) pairs; for samples averaging they are sets of (label) per sample; for the others, they are sets of (sample) per label. I thought that generalisation would help clarify it.

Would it be okay if I said explicitly that y and \hat{y} were made up of (sample, label) pairs, and then define y_l to be the subset of y with label l (perhaps y_l = {(s, l') \in y | l' = l}), and elsewhere y_s to be the subset of y with sample s or something similar?

It is precisely this sort of subset selection that defines all sorts of variations of precision and recall (e.g. variant metrics for named entity recognition that have lenient matches on boundary), and I think this set-based definition should be prominent.
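The (sample, label)-pair framing proposed above can be sketched as follows: y and \hat{y} are sets of (sample, label) pairs, y_l is the subset with label l, micro-averaging computes precision over all pairs at once, and macro-averaging computes it per label subset and then averages. This is a hypothetical illustration of the proposal, not the merged documentation.

```python
# Sets of (sample, label) pairs, as suggested above.
y_true = {(0, "a"), (0, "b"), (1, "a"), (2, "c")}
y_pred = {(0, "a"), (1, "b"), (2, "c")}

def precision(pred, true):
    return len(pred & true) / len(pred) if pred else 0.0

def subset(pairs, label):
    # y_l = {(s, l') in y | l' = l}: keep only pairs carrying label l
    return {(s, l) for (s, l) in pairs if l == label}

# Micro: one precision over the full pair sets.
micro = precision(y_pred, y_true)

# Macro: per-label precision on the subsets, then an unweighted mean.
labels = {l for _, l in y_true | y_pred}
macro = sum(precision(subset(y_pred, l), subset(y_true, l))
            for l in labels) / len(labels)
print(micro, macro)
```

Swapping `subset` for a selection over samples instead of labels would give the samples-averaged variant, which is the generalisation the comment argues for.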

@jnothman
Member Author

How about this version? I think this makes clear the similarities and differences between the different metrics.

``'weighted'``:
Average over classes weighted by support (takes imbalance into account).
Member

It may just be me, but I quite like the phrase *takes imbalance into account* being part of the description. Perhaps combine it with what you wrote? Just a thought :)
Beyond that, I like the current version and am +1 for merge

Member Author

Hrmh. See I thought it only needed specifying on 'macro' because it's the odd one out.

I guess what you want to get across is that "weighted" adjusts "macro" in order to take label imbalance into account. I'll try to include something like this before merge.
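The distinction being negotiated above can be sketched numerically: 'weighted' is 'macro' with each class's score weighted by its support (its number of true instances), so the score of a rare class no longer counts as much as that of a frequent one. A minimal sketch with made-up data, shown for recall:

```python
from collections import Counter

y_true = ["a", "a", "a", "b"]   # class "a" dominates (support 3 vs 1)
y_pred = ["a", "a", "b", "b"]

support = Counter(y_true)
labels = sorted(support)

def recall_of(label):
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    return tp / support[label]

# Macro: unweighted mean over classes -- both classes count equally.
macro = sum(recall_of(l) for l in labels) / len(labels)

# Weighted: each class's recall weighted by its support.
weighted = sum(recall_of(l) * support[l] for l in labels) / len(y_true)
print(macro, weighted)  # they differ because the classes are imbalanced
```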

As Gilles said, in general, more motivation for metric choice still needs to be included in the narrative...

@jnothman
Member Author

Added notes on accounting for label imbalance in "weighted"; merged as 7ebfd57.

Does this sort of documentation change belong in What's New?

@jnothman jnothman closed this May 21, 2013
@GaelVaroquaux
Member

Does this sort of documentation change belong in What's New?

I don't think so, sorry.

@jnothman
Member Author

I don't mind! Just checking because I hadn't at first realised it was common to update What's New within a patch, so now I'm curious to know what qualifies.

@GaelVaroquaux
Member

Just checking because I hadn't at first realised it was common to update What's New within a patch, so now I'm curious to know what qualifies.

Think of it in the shoes of a user: it's the document that you want to
read to see what has changed in scikit-learn with a new release.

@jnothman
Copy link
Member Author

Got it. Thanks.
