Confusion Matrix Representation / Return Value #19012

shubhamdo · 2020-12-15T17:36:54Z

Describe the workflow you want to enable

An enhancement to the output of confusion matrix function, better representing the true and predicted values for multilevel classes.

i.e. Current Representation with code:
from sklearn.metrics import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
Output:
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

Describe your proposed solution

With multiple class levels, the ndarray can be difficult to read, because it is hard to associate each level with the True and Predicted values.

Proposed solution should look similar to the table below, providing better readability of the confusion matrix.

		*Predicted*	*Value*
	*Levels*	ant	bird	cat
*True*	ant	2	0	0
*Value*	bird	0	0	1
	cat	1	0	2

Possible Solutions:

Provide a parameter to prettyprint the matrix. printMatrix [type:bool]
Include another parameter to return ndarray, index as true_values, columns as predicted_values
For example:
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
index=["true:ant", "true:bird", "true:cat"]
columns=["pred:ant", "pred:bird", "pred:cat"]
return cm, index, columns
These can easily be converted into a DataFrame for further use.
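As a rough illustration of that conversion, the sketch below wraps the existing ndarray in a pandas DataFrame. The "true:" and "pred:" prefixes follow the example above; the variable name cm_df is only illustrative, and nothing here is part of the scikit-learn API beyond confusion_matrix itself.

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

cm = confusion_matrix(y_true, y_pred, labels=labels)

# Rows of cm are true classes, columns are predicted classes.
cm_df = pd.DataFrame(
    cm,
    index=[f"true:{name}" for name in labels],
    columns=[f"pred:{name}" for name in labels],
)
print(cm_df)
```

This works today without any change to confusion_matrix, at the cost of the caller having to remember which axis is which.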

jnothman · 2020-12-16T11:59:26Z

I agree that the current output is unnecessarily difficult, and the confusion matrix is naturally portrayed as a dataframe... not sure if this is something we would consider introducing as a breaking change, for which we would require a hard dep on pandas...

glemaitre · 2020-12-16T18:19:27Z

Could we return a Python dict or a Bunch to go toward this direction but without a hard dependency?

shubhamdo · 2020-12-16T18:52:54Z

@jnothman @glemaitre Actually, I'm suggesting an optional alternative output, so existing code that depends on the current return value would not break.

For example, by adding a parameter: if prettyPrint = True, then
return cm, index, columns, or an extensible generic object such as a dict or a custom structured object (I'm still thinking about what that could be without adding a dep on pandas; I'll get back soon on this),
and print the confusion matrix as a table, as mentioned earlier.

By default, when prettyPrint = False, the standard output as we see at present:
return cm
i.e. cm = ndarray()
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

I would be happy to work on this if you think it's a viable feature.

shubhamdo · 2020-12-29T17:55:54Z

Hi, I was suggesting the output shown in the screenshot below. I have added a parameter called pprint to the existing function; if it is True, the function prints the output matrix. The returned object remains the same (i.e. ndarray).

Example:

from sklearn.metrics._classification import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"], pprint=True)

Output:

Function:

def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None, pprint=False):
........
################# To pretty print the confusion matrix ###############

    if pprint:
        labelList = labels.tolist()

        cm_arr = cm.tolist()
        for i in range(0, len(cm_arr)):
            cm_arr[i].insert(0, labelList[i])
            if i == (len(cm_arr) // 2):
                cm_arr[i].insert(0, "Predicted")
            else:
                cm_arr[i].insert(0, "         ")

        labelList.insert(0, "Lvl")
        labelList.insert(0, "           ")
        a = [{labelList[j]: i[j] for j in range(0, len(i))} for i in cm_arr]

        myList = [labelList]
        for item in a:
            myList.append([str(item[col] if item[col] is not None else '') for col in labelList])

        # Gets the column size for better printing
        colSize = [max(map(len, col)) for col in zip(*myList)]

        # To add separators based on the max len of the columns
        formatStr = ' | '.join(["{{:<{}}}".format(i) for i in colSize])

        # Separating line of the header
        myList.insert(1, [' ' * colSize[i] if i == 0 else '-' * colSize[i] for i in range(0, len(colSize))])

        print("              *** Confusion Matrix ***")
        print("                         True")

        # Add to the formatted structure
        for item in myList:
            print(formatStr.format(*item))

############################################################
    with np.errstate(all='ignore'):
        if normalize == 'true':
            cm = cm / cm.sum(axis=1, keepdims=True)
        elif normalize == 'pred':
            cm = cm / cm.sum(axis=0, keepdims=True)
        elif normalize == 'all':
            cm = cm / cm.sum()
        cm = np.nan_to_num(cm)

    return cm

jnothman · 2020-12-30T02:41:02Z

The rows should be true values, the columns predicted.

I don't think we want pprint like that, although I admit that whether we return a dict or a DataFrame or index and columns it remains tricky to communicate which is true and which is predicted.
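To make the orientation point concrete, here is a minimal sketch of a text rendering with true classes on the rows and predicted classes on the columns. The helper name format_confusion_matrix is hypothetical and not part of scikit-learn, and the hard-coded counts are the ones from the example in this thread.

```python
def format_confusion_matrix(cm, labels):
    """Hypothetical helper: render cm (rows = true, columns = predicted) as text."""
    header = ["true \\ pred"] + [str(label) for label in labels]
    rows = [[str(label)] + [str(v) for v in row] for label, row in zip(labels, cm)]
    table = [header] + rows

    # Width of each column is the length of its longest cell.
    widths = [max(len(cell) for cell in col) for col in zip(*table)]

    lines = [" | ".join(cell.ljust(w) for cell, w in zip(r, widths)) for r in table]
    # Separator line between the header and the body.
    lines.insert(1, "-+-".join("-" * w for w in widths))
    return "\n".join(lines)


cm = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]  # counts from the example in this thread
labels = ["ant", "bird", "cat"]
text = format_confusion_matrix(cm, labels)
print(text)
```

Putting "true \ pred" in the corner cell is one way to state the axes in the output itself rather than relying on documentation.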

shubhamdo · 2020-12-30T18:16:49Z

@jnothman @glemaitre Taking your comments into consideration, I've made another change so that the function returns its output in a built-in data type, i.e. dict(). This eliminates the need for a hard dependency and keeps the output usable with other third-party libraries too.

Please refer to the example code and screenshots. Thanks!

Example:
from sklearn.metrics._classification import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"], pprint=True)

Code: Output as a dict()

    if pprint:
        labelList = labels.tolist()

        cm_lol = cm.tolist()
        cm_dict = {str(labelList[j]): {str(labelList[i]): cm_lol[i][j] for i in
                                                 range(0, len(labelList))} for j in range(0, len(cm_lol))}

        return cm_dict

Output w/o pprint(False):

array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]], dtype=int64)

Output w/ pprint(True):

{'ant': {'ant': 2, 'bird': 0, 'cat': 1},
 'bird': {'ant': 0, 'bird': 0, 'cat': 0},
 'cat': {'ant': 0, 'bird': 1, 'cat': 2}}

For this solution changes required:

def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None, pprint=False):
....... 
 if pprint: # Logic
      return cm_dict #As the suggested output
.....

return cm

Option 2: For better understanding the true and pred values.

    if pprint:
        labelList = labels.tolist()

        cm_lol = cm.tolist()
        cm_dict = {"pred_" + str(labelList[j]): {"true_" + str(labelList[i]): cm_lol[i][j] for i in
                                                 range(0, len(labelList))} for j in range(0, len(cm_lol))}

        return cm_dict

Output:

{'pred_ant': {'true_ant': 2, 'true_bird': 0, 'true_cat': 1},
 'pred_bird': {'true_ant': 0, 'true_bird': 0, 'true_cat': 0},
 'pred_cat': {'true_ant': 0, 'true_bird': 1, 'true_cat': 2}}

yash-clear · 2021-01-01T20:25:00Z

@shubhamdo

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])

# plot the confusion matrix
f, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(cm, annot=True, linewidths=0.01, cmap=sns.cubehelix_palette(8), fmt='.1f', ax=ax)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()

jnothman · 2021-01-03T01:14:26Z

I'm okay with something like your Option 2, but I don't really like the use of underscores and am a little uncomfortable about coercing class names to strings. Does returning tuples as keys, such as ("true", 1) work with casting to DataFrame? Another option is to have a flat dict with {(true_class, pred_class): count} which is a collapsed form of the matrix.
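For illustration, the flat {(true_class, pred_class): count} form can be sketched with the standard library alone. This is only an assumption about what such an output might look like, not an existing scikit-learn feature; the name flat_cm is illustrative. Class labels are used as dict keys directly, so they are not coerced to strings.

```python
from collections import Counter
from itertools import product

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

# Count the observed (true, pred) pairs; pairs that never occur are absent.
pair_counts = Counter(zip(y_true, y_pred))

# Fill in zeros so that every (true_class, pred_class) combination is present.
flat_cm = {pair: pair_counts.get(pair, 0) for pair in product(labels, labels)}
print(flat_cm)
```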

shubhamdo · 2021-01-03T14:56:16Z

Returning tuples as keys within the nested dict works well when casting to a DataFrame. Please refer to the dict structure below.

{('pred', 'ant'): {('true', 'ant'): 2, ('true', 'bird'): 0, ('true', 'cat'): 1},
 ('pred', 'bird'): {('true', 'ant'): 0, ('true', 'bird'): 0, ('true', 'cat'): 0},
 ('pred', 'cat'): {('true', 'ant'): 0, ('true', 'bird'): 1, ('true', 'cat'): 2}}
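A minimal sketch of that cast, assuming pandas is available: passing a nested dict with tuple keys to pd.DataFrame turns the outer keys into MultiIndex columns and the inner keys into a MultiIndex on the rows. The variable names nested and cm_df are illustrative only.

```python
import pandas as pd

nested = {
    ("pred", "ant"): {("true", "ant"): 2, ("true", "bird"): 0, ("true", "cat"): 1},
    ("pred", "bird"): {("true", "ant"): 0, ("true", "bird"): 0, ("true", "cat"): 0},
    ("pred", "cat"): {("true", "ant"): 0, ("true", "bird"): 1, ("true", "cat"): 2},
}

# Outer keys become the columns, inner keys become the row index.
cm_df = pd.DataFrame(nested)
print(cm_df)

# Look up the count of true "cat" samples predicted as "ant".
cat_as_ant = cm_df.loc[("true", "cat"), ("pred", "ant")]
```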

Screenshot:

Another Option: Flat Dict
Here, I think the problem with the flat dict option is that casting it to a DataFrame needs extra steps.
If we need a one-step conversion, then the nested dict with tuples as keys is a good choice, as seen above.

Flat Dict:

{('pred_ant', 'true_ant'): 2, ('pred_ant', 'true_bird'): 0, ('pred_ant', 'true_cat'): 1,
 ('pred_bird', 'true_ant'): 0, ('pred_bird', 'true_bird'): 0, ('pred_bird', 'true_cat'): 0,
 ('pred_cat', 'true_ant'): 0, ('pred_cat', 'true_bird'): 1, ('pred_cat', 'true_cat'): 2}

Screenshot:

I think the nested dict with tuples as keys represents the data very well, preserving class names and allowing easy conversion to a DataFrame.
Please let me know your thoughts; I'll make the changes accordingly. Thanks!

jnothman · 2021-01-04T02:40:52Z

Re the flat dict, I was thinking to put true then pred in the keys, in line with the input args. Because it parallels the input args, I would not need "true" and "pred" explicitly there (although OTOH, your example violated my intuitions!). So more along the lines of:

C = {('ant', 'ant'): 2, ('bird', 'ant'): 0, ('cat', 'ant'): 1,
 ('ant', 'bird'): 0, ('bird', 'bird'): 0, ('cat', 'bird'): 0,
 ('ant', 'cat'): 0, ('bird', 'cat'): 1, ('cat', 'cat'): 2}

then

>>> pd.Series(C)
ant   ant     2
bird  ant     0
cat   ant     1
ant   bird    0
bird  bird    0
cat   bird    0
ant   cat     0
bird  cat     1
cat   cat     2
>>> pd.Series(C).unstack()
      ant  bird  cat
ant     2     0    0
bird    0     0    1
cat     1     0    2

shubhamdo · 2021-01-04T21:32:02Z

I've made changes to the code to replicate the output like you mentioned.

    if pprint:
        labelList = labels.tolist()
        cm_list = cm.tolist()
        cm_dict = {(labelList[j], labelList[i]): cm_list[j][i] for i in range(0, len(labelList))
                   for j in range(0, len(cm_list))}

        return cm_dict

Output:
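For reference, a standalone sketch of the comprehension above. The hard-coded cm_list stands in for cm.tolist() and uses the counts from the example in this thread; the final unstack() step follows the pandas usage shown earlier in the discussion and is not part of the proposed change.

```python
import pandas as pd

labels = ["ant", "bird", "cat"]
cm_list = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]  # rows = true, columns = predicted

# Keys are (true_class, pred_class); iteration runs over predicted classes
# in the outer loop and true classes in the inner loop.
cm_dict = {
    (labels[j], labels[i]): cm_list[j][i]
    for i in range(len(labels))
    for j in range(len(cm_list))
}
print(cm_dict)

# Tuple keys give a two-level index; unstack() moves the second level
# (predicted class) to the columns.
cm_df = pd.Series(cm_dict).unstack()
print(cm_df)
```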

glemaitre · 2021-01-05T09:23:31Z

@jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack.

glemaitre · 2021-01-05T14:25:52Z

In some way, I am thinking that adding this feature would be enough to close #5516 and avoid implementing #17265.

jnothman · 2021-01-11T11:48:23Z

> @jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack

How would you make it intuitively clear which axis is true and which pred? Or would we rely on documentation?

jnothman · 2021-01-11T11:49:14Z

> In some way, I am thinking that adding this feature would be enough to close #5516 and avoid implementing #17265.

There, reference implementations of commonly used metrics are sought... so not really. Also, scorers might be helpful there.

varunjohn786 · 2021-01-11T12:33:51Z

> > @jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack
>
> How would you make it intuitively clear which axis is true and which pred? Or would we rely on documentation?

@jnothman Won't this reliance on documentation, and the ambiguity about which axis is true and which is pred, persist even with a tuple-keyed dictionary?

glemaitre · 2021-01-21T08:55:07Z

@jnothman When I thought about the nested dict, I was thinking that "Actual label" and "Predicted label" would be the outer index that could be used to create the dataframe. In that case there should be no ambiguity, right?
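One possible reading of this suggestion, sketched under the assumption that "Actual label" and "Predicted label" form the outer level of tuple keys in a nested dict. The dict layout and the names nested and cm_df are illustrative, not a confirmed design.

```python
import numpy as np
import pandas as pd

labels = ["ant", "bird", "cat"]
cm = np.array([[2, 0, 0], [0, 0, 1], [1, 0, 2]])  # rows = true, columns = predicted

# Outer dict is keyed by predicted class, inner dict by actual class,
# each tagged with an explicit axis name as the first tuple element.
nested = {
    ("Predicted label", pred): {
        ("Actual label", true): int(cm[i, j]) for i, true in enumerate(labels)
    }
    for j, pred in enumerate(labels)
}

cm_df = pd.DataFrame(nested)
print(cm_df)
```

With the axis names carried in the keys, the resulting DataFrame labels both axes without a separate unstack step.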

tansaku · 2022-11-06T09:33:05Z

I think that a flag to get output with more labels would be a huge win - I guess we'd need to try and get this PR merged? #19190

shubhamdo added the New Feature label Dec 15, 2020

shubhamdo linked a pull request Jan 17, 2021 that will close this issue

Enhancement to Confusion Matrix Output Representation for improving readability #19012 #19190

Open

cmarmo added the module:metrics label Jan 18, 2021

glemaitre mentioned this issue Sep 9, 2021

Additional metrics in sklearn.metrics.classification_report #21000

Open
