[MRG] Add specificity score as a metric #10831
Conversation
@jnothman FYI.
I think we need to either not support multiclass or document the multiclass behaviour better. Is this a standard definition for the multiclass case? The binary case can also be calculated with recall_score, though.
How would we calculate it with recall_score? Perhaps you are referring to sensitivity, not the specificity I aim to compute here? In multiclass cases it is often necessary to compute the false positive rate for each class to evaluate the model. By computing and returning the specificity for each class, we allow the user to refer to individual values or to compute the macro/micro average from the returned array. If you would like, I can document the multiclass behaviour along these lines, or with more examples?
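A minimal sketch of the per-class idea described above (the function name `per_class_specificity` is illustrative only, not the API proposed in this PR): for each class, treat it as the positive class, derive TN and FP from the multiclass confusion matrix, and return one specificity value per class.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def per_class_specificity(y_true, y_pred):
    cm = confusion_matrix(y_true, y_pred)
    fp = cm.sum(axis=0) - np.diag(cm)  # predicted as the class but actually another class
    tn = cm.sum() - (cm.sum(axis=0) + cm.sum(axis=1) - np.diag(cm))  # neither true nor predicted as the class
    return tn / (tn + fp)

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]
print(per_class_specificity(y_true, y_pred))  # [1.0, 0.75, 0.75]; average for macro specificity
```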
I meant recall_score(y, y', pos_label=0) for instance.
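For clarity, this is what that call computes in the binary case: recall of the negative class, i.e. TN / (TN + FP), which is exactly the specificity.

```python
from sklearn.metrics import recall_score

y_true = [0, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0]
# Treat the negative class (label 0) as the "positive" class for recall:
# of the 3 true negatives, 2 were predicted as negative, so specificity = 2/3.
print(recall_score(y_true, y_pred, pos_label=0))  # 0.666...
```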
Please also add specificity_score to sklearn/metrics/tests/test_common.py. You might also want to see #10628, which may make the implementation of specificity_score less necessary, or at least simplify it, by providing multilabel_confusion_matrix.
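A sketch of how #10628 could simplify things, assuming its multilabel_confusion_matrix returns one [[tn, fp], [fn, tp]] matrix per class (as the function added in later scikit-learn versions does):

```python
import numpy as np
from sklearn.metrics import multilabel_confusion_matrix

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]
mcm = multilabel_confusion_matrix(y_true, y_pred)  # shape (n_classes, 2, 2)
tn, fp = mcm[:, 0, 0], mcm[:, 0, 1]
specificity = tn / (tn + fp)                        # one value per class
print(specificity)
```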
And your tests are currently failing.
Well, the … Let me know.
Well, I'm not yet sure about the implementation of multilabel_confusion_matrix. It hasn't been benchmarked, for instance, and some of the code has become a bit hacky and could be neater. If you'd like to take on benchmarking it and bringing it to completion, I'd be interested in having it off my hands!

I had also thought we should consider a 'specificity' scorer for the binary case in sklearn/metrics/scorer.py.
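One possible shape for such a binary scorer, using only existing public helpers (make_scorer and recall_score); the name specificity_scorer and its registration as a named string are hypothetical, not part of scikit-learn:

```python
from sklearn.metrics import make_scorer, recall_score

# Specificity in the binary case is recall of the negative class,
# so a scorer can be built without any new metric function.
specificity_scorer = make_scorer(recall_score, pos_label=0)

# Usage sketch:
# from sklearn.model_selection import cross_val_score
# cross_val_score(clf, X, y, scoring=specificity_scorer)
```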
Reference Issues/PRs
Please see Issue #10391
What does this implement/fix? Explain your changes.
As per the discussion, instead of adding a false positive rate metric, this adds a specificity score metric.
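For context, specificity and the false positive rate are complements (specificity = 1 - FPR), so either carries the same information; a small binary example:

```python
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
specificity = tn / (tn + fp)   # 2 / 3
fpr = fp / (fp + tn)           # 1 / 3
assert abs(specificity - (1 - fpr)) < 1e-12
```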
Any other comments?