[WIP] enhance labels and deprecate pos_label in PRF metrics #2610
Thanks for working on those issues !!! |
np! The fiddliness of But there is at least one problem. Consider the following parameters in the current code:
This was treated as the special binary case, i.e. the metric was returned only for the positive label (P,R,F=1/3). Because I can think of the following deprecation options for this weird case:
Also, I'm considering changing it to, by default, do the automatic handling of binary like it currently does (but with an inferred |
If someone has a moment to review this -- as an idea if not code -- I am very eager to hear comment on the deprecation edge-case I described in my previous comment. I'm also interested in whether we should maintain special handling of binary targets as a default (not just during deprecation), even if it provides confusing implicit behaviour (e.g. #2094), for convenience. |
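To make the binary special case under discussion concrete, here is a minimal pure-Python sketch. It is not scikit-learn's implementation, and the parameters from the comment above are not preserved, so the helper names (prf_for_label, prf_binary, prf_per_label) and the toy data are hypothetical. It only contrasts reporting P/R/F for the positive label alone with reporting them for every requested label.

```python
from collections import namedtuple

# Hypothetical result container, loosely mirroring the (P, R, F, support) tuple.
PRF = namedtuple("PRF", "precision recall f1 support")


def prf_for_label(y_true, y_pred, label):
    """Precision, recall and F1 with `label` treated as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    n_pred = sum(1 for p in y_pred if p == label)   # predicted as `label`
    n_true = sum(1 for t in y_true if t == label)   # truly `label` (support)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return PRF(precision, recall, f1, n_true)


def prf_binary(y_true, y_pred, pos_label=1):
    """Binary special case: report the metric for the positive label only."""
    return prf_for_label(y_true, y_pred, pos_label)


def prf_per_label(y_true, y_pred, labels):
    """Explicit alternative: report the metric for each requested label."""
    return {label: prf_for_label(y_true, y_pred, label) for label in labels}


# Toy data (an assumption for illustration, not taken from the thread).
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]

binary_result = prf_binary(y_true, y_pred, pos_label=1)
per_label_result = prf_per_label(y_true, y_pred, labels=[0, 1])
print(binary_result)
print(per_label_result)
```

With the same targets, the two calls can give quite different pictures of performance, which is why implicit switching between them based on the data is confusing.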
I don't have a strong opinion. My only concern is keeping the behaviour simple while staying compatible with both binary and multi-class scorers.
I have seen numpy using this strategy. |
I've decided that to make things more explicit, and to enable some features On Tue, Dec 17, 2013 at 7:26 PM, Arnaud Joly notifications@github.com wrote:
|
MAINT warn of future behaviour change proposed in #2610
Will this fix #3123? Also: can I help? |
I'll try to get this to MRG soon. Should not be a big deal. I'm not sure it will intentionally fix #3123, but I can make a point of doing so so as not to make conflicts. |
Thanks, let me know if you feel ready for review. |
@amueller, @arjoly and anyone else interested: I've recalled that this was somewhat muddied by #2679 being altered to have an I think that stopping special handling of binary data needs to happen before So I propose that, at least for now, we keep At the same time, recall this is not just an API fix, but supports an additional (useful) case wherein Rather, this PR or its successor should:
Thus in the future, (This has taken an alarming effort to reason through, so I hope I've come to correct conclusions! At some point soon I'll try to implement them, although I don't look forward to testing all these possibilities and their backwards-compatibilities.) |
That leads me to split this into two PRs: one that basically finishes up the work of #2679 to handle the converse case; the other to extend and fix the handling of |
(The other reason I abandoned this PR was because of the splitting of |
Closing this incarnation. |
This intends to make the parameter labels clearly defined for the precision_recall_fscore_support family of metrics. This amounts to a (partial) fix for #1983, #1989, #2029, #2094, #3122. As implied by the comment I have committed, labels will:
pos_label is determined implicitly.
There are some potential issues in the deprecation process, and in making binary classifier metrics available as Scorers...
labels functionality
labels functionality
labels is set correctly for legacy functionality
pos_label
average_precision_score, roc_auc_score and any other binary-averaged metrics.
This should not be merged until after the 0.15 release to allow at least one release with the warning merged from #2952.