ENH: add normalize parameter to metrics.classification.confusion_matrix #14478
Conversation
Allows getting a normalized confusion matrix directly from the function call. I use this function frequently and find the need to always normalize the matrix manually a bit tedious.
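As context for the change, this is the manual normalization the author describes having to repeat; the `y_true`/`y_pred` arrays below are illustrative toy data, not from the PR:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy data for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred)

# The manual step the PR wants to fold into the function:
# divide each row by its sum so rows (ground-truth classes) sum to 1.
cm_normalized = cm / cm.sum(axis=1, keepdims=True)
```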
@@ -211,6 +212,10 @@ def confusion_matrix(y_true, y_pred, labels=None, sample_weight=None):
         If none is given, those that appear at least once
         in ``y_true`` or ``y_pred`` are used in sorted order.

+    normalize : bool, optional (default=False)
+        If True, return the fraction of classified samples (float),
You can normalise by number of samples (like accuracy), by number of samples in the ground truth for each class (like recall), or by number of samples predicted for each class (like precision). I find this description highly ambiguous, and I'm not persuaded we should assume one normalisation is more appropriate than another for the user.
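The three normalizations the reviewer distinguishes can be written out explicitly; this is a sketch on toy data (the arrays are illustrative), not code from the PR:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy data for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred).astype(float)

# By total number of samples (like accuracy): whole matrix sums to 1.
cm_all = cm / cm.sum()

# By ground-truth class counts (like recall): each row sums to 1.
cm_true = cm / cm.sum(axis=1, keepdims=True)

# By predicted class counts (like precision): each column sums to 1.
cm_pred = cm / cm.sum(axis=0, keepdims=True)
```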
Good point. I did it that way because the normalization is most frequently done with respect to the ground truth; when we think of a normalized confusion matrix, that is precisely the picture that comes to mind. To avoid suggesting just one specific type of normalization to the user, different arguments for the normalize parameter could be accepted, e.g. "precision"/"accuracy"/"recall".
I'm okay with giving options. I think calling it precision/recall/accuracy is a bit misleading, since those terms don't pertain to the off-diagonal entries of the matrix. true vs. pred might be better names. It's still not entirely clear to me that providing this facility is of great benefit to users.
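The option-based interface being discussed could look roughly like the wrapper below. This is a hypothetical sketch, not the PR's code: the wrapper name is invented, and the option strings 'true'/'pred'/'all' follow the reviewer's naming suggestion rather than anything merged.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def confusion_matrix_normalized(y_true, y_pred, normalize=None):
    """Hypothetical wrapper sketching the option-based interface
    discussed above; option names are placeholders."""
    cm = confusion_matrix(y_true, y_pred).astype(float)
    if normalize == 'true':    # normalize over ground-truth rows (recall-like)
        cm /= cm.sum(axis=1, keepdims=True)
    elif normalize == 'pred':  # normalize over predicted columns (precision-like)
        cm /= cm.sum(axis=0, keepdims=True)
    elif normalize == 'all':   # normalize by total sample count (accuracy-like)
        cm /= cm.sum()
    elif normalize is not None:
        raise ValueError("normalize must be 'true', 'pred', 'all', or None")
    return cm
```

With `normalize=None` the wrapper returns the plain count matrix, so it stays backward compatible with the existing behavior.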
@@ -184,7 +184,8 @@ def accuracy_score(y_true, y_pred, normalize=True, sample_weight=None):
     return _weighted_sum(score, sample_weight, normalize)


-def confusion_matrix(y_true, y_pred, labels=None, sample_weight=None):
+def confusion_matrix(y_true, y_pred, labels=None,
+                     normalize=False, sample_weight=None):
You should keep sample_weight before normalize, just in case. With your proposal, you also need to implement tests to ensure that the function will work properly.
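The kind of test the reviewer is asking for could be sketched as follows; the test name and toy data are illustrative, and it checks the row-normalization invariant via the existing public function rather than the PR's internal changes:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def test_confusion_matrix_row_normalization():
    # Toy data for illustration only.
    y_true = [0, 0, 1, 1]
    y_pred = [0, 1, 1, 1]
    cm = confusion_matrix(y_true, y_pred).astype(float)
    cm_norm = cm / cm.sum(axis=1, keepdims=True)
    # Invariant a normalized confusion matrix must satisfy:
    # every row of a row-normalized matrix sums to 1.
    assert np.allclose(cm_norm.sum(axis=1), 1.0)
```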
Allows getting a normalized confusion matrix directly from the function call. I use confusion_matrix frequently and find the need to always normalize the matrix manually perhaps unnecessary. I am aware that other functions like accuracy_score already have this exact functionality implemented, so the lack of the normalize parameter is probably intentional and I'm missing the why. But in case it's not intentional, you might find this contribution useful :).
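For reference, this is the existing normalize behavior in accuracy_score that the author points to, shown on toy data (the arrays are illustrative):

```python
from sklearn.metrics import accuracy_score

# Toy data for illustration only: 3 of 4 predictions are correct.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 1, 2]

frac = accuracy_score(y_true, y_pred)                    # fraction correct: 0.75
count = accuracy_score(y_true, y_pred, normalize=False)  # raw count correct: 3
```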