ENH P/R/F should be able to ignore a majority class in the multiclass case #1983
Comments
Say we add … and if so, it needs some special values like: …

WDYT?
Also, @arjoly, can we not assume as a rule that a label indicator matrix consists of 0s and 1s, or Falses and Trues? Do we really need to use …?
This feature could be removed.
Do you have some references about this?
Even if we're not assured the others are zeros, I think
I haven't gone looking for them, but for example, see the last comment at http://metaoptimize.com/qa/questions/8284/does-precision-equal-to-recall-for-micro-averaging which suggests the case where you "have some samples which are not classified to belong to any of known classes" in an otherwise single-label multiclass task. It's easy to come up with such classification tasks, such as classifying Wikipedia articles into non-overlapping named entity categories: the vast majority of articles are non-entities, and a micro-average or a macro-average would be a good measure of performance as long as that non-entity class is not taken into account. There's nothing special about the binary case: it's just interpreted as if it's multilabel with a single class. We should also be able to treat multiclass as if it's zero-or-one label.
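The effect described above can be sketched in plain Python (function name and data are hypothetical, not scikit-learn API): micro-averaging one-vs-rest counts while excluding the majority negative class lets precision and recall diverge, whereas micro-averaging over all classes in a single-label multiclass task forces P = R = F = accuracy.

```python
def micro_prf(y_true, y_pred, ignore=None):
    """Micro-averaged precision/recall/F1 for single-label multiclass,
    optionally ignoring one (majority negative) class.

    Treats the problem one-vs-rest: a prediction is a true positive only
    if it matches the gold label, and the ignored class contributes to
    neither the numerator nor either denominator.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p and t != ignore)
    n_pred = sum(1 for p in y_pred if p != ignore)  # predicted positives
    n_true = sum(1 for t in y_true if t != ignore)  # actual positives
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# A toy NER-style task: 'O' is the vast non-entity class.
y_true = ['O', 'O', 'O', 'O', 'PER', 'PER', 'LOC']
y_pred = ['O', 'O', 'O', 'PER', 'PER', 'LOC', 'LOC']

# Over all classes, micro P == R == F == accuracy.
print(micro_prf(y_true, y_pred))
# Ignoring 'O', precision and recall differ, as the issue argues.
print(micro_prf(y_true, y_pred, ignore='O'))
```
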
Sorry for the dumb questions, but what is a named entity category? And what is a non-entity in this context?
The last comment simply states that, in the multilabel case, having at least one sample with no true label and no predicted label yields different micro-precision, micro-recall and micro-F1 scores.
Before you rewrote the average description, there was a comment about this in the docstring.
What if your classifier is not able to identify the negative class? I am not against the idea of implementing this functionality, but I wouldn't want to see it without some good references.
I'm using terms from Named Entity Recognition. Basically, let's say we want to decide whether a Wikipedia topic is a Person, Location or Organisation, and the rest is noise. That is a multiclass classification problem with a vast negative class. I have published work on an expanded version of this classification task, but it does not explicitly describe my calculation of micro-F. I in fact report micro-F excluding multiple negative classes output by my classifier. Perhaps I should support that too.
It said:

```python
>>> import sklearn.metrics
>>> print(sklearn.metrics.precision_recall_fscore_support([[0, 1]], [[0]], average='micro'))
(1.0, 0.5, 0.667, None)
```

[Actually, that's not what it outputs at master. Apparently there's a bug in your implementation -- one I haven't yet investigated but we need to test -- which returns …]

And often people wouldn't consider multiclass classification with a negative class to be multilabel classification, just as binary classification isn't considered multilabel classification. Multilabel implies the system may output [0 .. n_labels] labels per sample. These cases are {0, 1}.
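The figures quoted in that snippet can be reproduced with a small set-based sketch (function name hypothetical, not scikit-learn's implementation): with gold labels {0, 1} and the single prediction {0}, every predicted label is correct (P = 1.0) but only half the gold labels are recovered (R = 0.5), so micro P ≠ R.

```python
def micro_prf_multilabel(true_sets, pred_sets):
    """Micro-averaged P/R/F1 over multilabel data encoded as label sets."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    n_pred = sum(len(p) for p in pred_sets)  # total predicted labels
    n_true = sum(len(t) for t in true_sets)  # total gold labels
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# One sample: gold labels {0, 1}, predicted only {0}.
# Precision 1.0, recall 0.5, F1 about 0.667.
print(micro_prf_multilabel([{0, 1}], [{0}]))
```
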
I don't know what this means. I still haven't got my hands on any references. However, for the multiclass case that you speak of, where every sample is assigned one meaningful class, you get …
(And the bug in your implementation is that you treat …)
Fixed in #4287
P/R/F are famous for handling class imbalance in the binary classification case. Correct me if I'm wrong (@arjoly?), but imbalance against a majority negative class should also be handled in the multiclass case. In particular, while the documentation currently states that micro-averaged P = R = F, this is not true when a negative class is ignored; and it should be possible to ignore a negative class for any of the `average` settings.

Indeed, I think the `pos_label` argument is a mistake (except in that you can more reliably provide a default value for it than for `neg_label`): it only applies to the binary case and overrides the `average` setting; `neg_label` would apply to all multiclass averaging methods.

It should be easy to implement: treat the problem as multilabel and delete the `neg_label` column from the label indicator matrix. I.e. it is the case where each instance is assigned 0 or 1 label.

The tricky part is the interface: should `pos_label` be deprecated? Deprecation makes sense as `pos_label` and `neg_label` should not be necessary together. But if so, how do we ensure the binary case works by default?
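The implementation idea in that paragraph can be sketched in plain Python (function and variable names hypothetical, `neg_label` here just an ordinary argument): binarize the single-label multiclass targets into a one-vs-rest label indicator matrix and drop the negative class's column, leaving each row with zero or one 1s.

```python
def indicator_without_negative(y, labels, neg_label):
    """Binarize single-label multiclass targets one-vs-rest, then drop
    the column for the majority negative class. Each remaining row has
    either a single 1 (a positive class) or all 0s (the ignored class),
    i.e. each instance is assigned 0 or 1 label."""
    keep = [lab for lab in labels if lab != neg_label]
    return [[int(yi == lab) for lab in keep] for yi in y]

# 'O' is the majority negative class; its rows become all-zero.
y = ['O', 'PER', 'O', 'LOC']
print(indicator_without_negative(y, ['O', 'PER', 'LOC'], 'O'))
# -> [[0, 0], [1, 0], [0, 0], [0, 1]]  (columns: PER, LOC)
```

Multilabel metrics computed on this reduced matrix then ignore the negative class entirely, which is the behaviour the issue asks for.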