Confusion Matrix Representation / Return Value #19012

Open
shubhamdo opened this issue Dec 15, 2020 · 18 comments · May be fixed by #19190

Comments

@shubhamdo

Describe the workflow you want to enable

An enhancement to the output of the confusion_matrix function that better represents the true and predicted values for classes with multiple levels.

  • i.e. Current representation, with code:
    from sklearn.metrics import confusion_matrix
    y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
    y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
    confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
    Output:
    array([[2, 0, 0],
           [0, 0, 1],
           [1, 0, 2]])

Describe your proposed solution

With many levels, the ndarray becomes difficult to read because nothing ties each row and column to its level, or indicates which axis holds the true values and which the predicted values.

The proposed solution should look similar to the table below, which makes the confusion matrix easier to read.

                Predicted Value
       Levels   ant   bird   cat
True   ant        2      0     0
Value  bird       0      0     1
       cat        1      0     2

Possible Solutions:

  1. Provide a parameter to prettyprint the matrix. printMatrix [type:bool]
  2. Include another parameter to return ndarray, index as true_values, columns as predicted_values
    For example:
    cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])
    index=["true:ant", "true:bird", "true:cat"]
    columns=["pred:ant", "pred:bird", "pred:cat"]
    return cm, index, columns
    Which can be easily converted into a dataframe for further use
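
The second option can already be approximated outside the library. A minimal sketch, assuming pandas is installed; the `true:`/`pred:` prefixes follow the proposal above, and the variable names are illustrative only:

```python
import pandas as pd
from sklearn.metrics import confusion_matrix

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
labels = ["ant", "bird", "cat"]

# Rows of the returned array are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)

# Label both axes so the orientation is visible in the printed frame.
index = [f"true:{label}" for label in labels]
columns = [f"pred:{label}" for label in labels]
cm_df = pd.DataFrame(cm, index=index, columns=columns)
print(cm_df)
```

Individual cells can then be read by name, for example `cm_df.loc["true:bird", "pred:cat"]`.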
@jnothman
Member

I agree that the current output is unnecessarily difficult, and the confusion matrix is naturally portrayed as a dataframe... not sure if this is something we would consider introducing as a breaking change, for which we would require a hard dep on pandas...

@glemaitre
Member

Could we return a Python dict or a Bunch to go toward this direction but without a hard dependency?
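
One way to picture the Bunch idea is a thin wrapper that leaves `confusion_matrix` itself untouched. This is only a sketch: `labeled_confusion_matrix` and the field names `matrix`, `true_labels` and `pred_labels` are hypothetical, not part of scikit-learn.

```python
from sklearn.metrics import confusion_matrix
from sklearn.utils import Bunch


def labeled_confusion_matrix(y_true, y_pred, labels):
    """Hypothetical helper: pair the matrix with the labels of each axis."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    # Axis 0 (rows) holds true classes, axis 1 (columns) holds predictions.
    return Bunch(matrix=cm, true_labels=list(labels), pred_labels=list(labels))


y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
result = labeled_confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])

print(result.matrix)          # attribute access
print(result["true_labels"])  # dict-style access also works on a Bunch
```

Because a Bunch is a dict subclass, such a return value would need neither pandas nor a new container type.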

@shubhamdo
Author

shubhamdo commented Dec 16, 2020

@jnothman @glemaitre Actually, I'm suggesting an alternative output that shouldn't affect dependent code or introduce breaking changes.

For example, by adding a parameter: if prettyPrint = True, then
return cm, index, columns, or an extensible generic object such as a dict or a custom structured object (I'm still thinking about what that could be without adding a dependency on pandas; I'll get back soon on this),
and print the confusion matrix as a table, as mentioned earlier.

By default, when prettyPrint = False, the standard output is returned as at present:
return cm
i.e. cm = ndarray()
array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]])

I'd be happy to work on this if you think it's a viable feature.

@shubhamdo
Author

Hi, I was suggesting the output shown in the screenshot below. I have added a parameter called pprint to the existing function; if it's True, the function prints the matrix as a table. The returned object remains the same (i.e. an ndarray).

Example:

from sklearn.metrics._classification import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"], pprint=True)

Output:
[screenshot: output_sklearn]

Function:

def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None, pprint=False):
........
################# To pretty print the confusion matrix ###############

    if pprint:
        labelList = labels.tolist()

        cm_arr = cm.tolist()
        for i in range(0, len(cm_arr)):
            cm_arr[i].insert(0, labelList[i])
            if i == (len(cm_arr) // 2):
                cm_arr[i].insert(0, "Predicted")
            else:
                cm_arr[i].insert(0, "         ")

        labelList.insert(0, "Lvl")
        labelList.insert(0, "           ")
        a = [{labelList[j]: i[j] for j in range(0, len(i))} for i in cm_arr]

        myList = [labelList]
        for item in a:
            myList.append([str(item[col] if item[col] is not None else '') for col in labelList])

        # Gets the column size for better printing
        colSize = [max(map(len, col)) for col in zip(*myList)]

        # To add separators based on the max len of the columns
        formatStr = ' | '.join(["{{:<{}}}".format(i) for i in colSize])

        # Separating line of the header
        myList.insert(1, [' ' * colSize[i] if i == 0 else '-' * colSize[i] for i in range(0, len(colSize))])

        print("              *** Confusion Matrix ***")
        print("                         True")

        # Add to the formatted structure
        for item in myList:
            print(formatStr.format(*item))

############################################################
    with np.errstate(all='ignore'):
        if normalize == 'true':
            cm = cm / cm.sum(axis=1, keepdims=True)
        elif normalize == 'pred':
            cm = cm / cm.sum(axis=0, keepdims=True)
        elif normalize == 'all':
            cm = cm / cm.sum()
        cm = np.nan_to_num(cm)

    return cm

@jnothman
Member

jnothman commented Dec 30, 2020

The rows should be true values, the columns predicted.

I don't think we want pprint like that, although I admit that whether we return a dict or a DataFrame or index and columns it remains tricky to communicate which is true and which is predicted.
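
For reference, a table with the orientation described here (rows true, columns predicted) can be produced outside the library without touching the return value. A minimal pure-Python sketch; `format_confusion_table` is a hypothetical helper, not an existing scikit-learn function:

```python
def format_confusion_table(cm, labels):
    """Hypothetical helper: render a confusion matrix as plain text.

    cm is a sequence of rows; rows are true classes, columns are predictions.
    """
    labels = [str(label) for label in labels]
    header = ["true \\ pred"] + labels
    rows = [[label] + [str(v) for v in row] for label, row in zip(labels, cm)]
    table = [header] + rows

    # Width of each column is the length of its longest cell.
    widths = [max(len(cell) for cell in col) for col in zip(*table)]
    lines = [
        " | ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in table
    ]
    # Separator line between the header and the body.
    lines.insert(1, "-+-".join("-" * w for w in widths))
    return "\n".join(lines)


cm = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]
table = format_confusion_table(cm, ["ant", "bird", "cat"])
print(table)
```

Returning a string instead of printing keeps the helper free of side effects, so the caller decides where the table goes.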

@shubhamdo
Author

@jnothman @glemaitre taking your comments into consideration, I've made another change so that the function returns its output in a built-in data type, i.e. dict(). This eliminates the need for hard dependencies and keeps the output usable with any third-party library.

Please refer to the example code and screenshots. Thanks!

Example:
from sklearn.metrics._classification import confusion_matrix
y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"], pprint=True)

Code: Output as a dict()

    if pprint:
        labelList = labels.tolist()

        cm_lol = cm.tolist()
        cm_dict = {str(labelList[j]): {str(labelList[i]): cm_lol[i][j] for i in
                                                 range(0, len(labelList))} for j in range(0, len(cm_lol))}

        return cm_dict

Output w/o pprint(False):

array([[2, 0, 0],
       [0, 0, 1],
       [1, 0, 2]], dtype=int64)

Output w/ pprint(True):

{'ant': {'ant': 2, 'bird': 0, 'cat': 1},
 'bird': {'ant': 0, 'bird': 0, 'cat': 0},
 'cat': {'ant': 0, 'bird': 1, 'cat': 2}}

For this solution changes required:

def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
                     normalize=None, pprint=False):
    ...
    if pprint:  # Logic
        return cm_dict  # As the suggested output
    ...

    return cm

Option 2: For better understanding the true and pred values.

    if pprint:
        labelList = labels.tolist()

        cm_lol = cm.tolist()
        cm_dict = {"pred_" + str(labelList[j]): {"true_" + str(labelList[i]): cm_lol[i][j] for i in
                                                 range(0, len(labelList))} for j in range(0, len(cm_lol))}

        return cm_dict

Output:

{'pred_ant': {'true_ant': 2, 'true_bird': 0, 'true_cat': 1},
 'pred_bird': {'true_ant': 0, 'true_bird': 0, 'true_cat': 0},
 'pred_cat': {'true_ant': 0, 'true_bird': 1, 'true_cat': 2}}
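
To see how this nested dict maps onto a table, it can be passed straight to pandas. A small sketch, assuming pandas is available; the dict literal is copied from the output above:

```python
import pandas as pd

cm_dict = {
    "pred_ant": {"true_ant": 2, "true_bird": 0, "true_cat": 1},
    "pred_bird": {"true_ant": 0, "true_bird": 0, "true_cat": 0},
    "pred_cat": {"true_ant": 0, "true_bird": 1, "true_cat": 2},
}

# Outer keys become the columns, inner keys become the row index,
# so the frame has true classes as rows and predicted classes as columns.
cm_df = pd.DataFrame(cm_dict)
print(cm_df)
```

With the predicted class as the outer key, the resulting frame has the same orientation as the ndarray that `confusion_matrix` returns today.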

@yash-clear

@shubhamdo
[image: confusion_matrix]

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm = confusion_matrix(y_true, y_pred, labels=["ant", "bird", "cat"])

# plot the confusion matrix
f, ax = plt.subplots(figsize=(10, 10))
sns.heatmap(cm, annot=True, linewidths=0.01, cmap=sns.cubehelix_palette(8), fmt='.1f', ax=ax)
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.title("Confusion Matrix")
plt.show()

@jnothman
Member

jnothman commented Jan 3, 2021 via email

@shubhamdo
Author

Using tuples as keys within the nested dict works well when casting to a DataFrame; please refer to the dict structure below.

{('pred', 'ant'): {('true', 'ant'): 2, ('true', 'bird'): 0, ('true', 'cat'): 1},
('pred', 'bird'): {('true', 'ant'): 0, ('true', 'bird'): 0, ('true', 'cat'): 0}, 
('pred', 'cat'): {('true', 'ant'): 0, ('true', 'bird'): 1, ('true', 'cat'): 2}}

Screenshot:
nested_dict_w_tupl_key

Another Option: Flat Dict
Here, I think the problem with the flat dict option is that casting it to a DataFrame needs extra steps.
If we need a one-step conversion, then the nested dict with tuples as keys is a good fit, as seen above.

Flat Dict:

{('pred_ant', 'true_ant'): 2, ('pred_ant', 'true_bird'): 0, ('pred_ant', 'true_cat'): 1,
 ('pred_bird', 'true_ant'): 0, ('pred_bird', 'true_bird'): 0, ('pred_bird', 'true_cat'): 0,
 ('pred_cat', 'true_ant'): 0, ('pred_cat', 'true_bird'): 1, ('pred_cat', 'true_cat'): 2}

Screenshot:
flat_dict

I think the nested dict with tuples as keys represents the data very well, preserving class names and allowing easy conversion to a DataFrame.
Please let me know your thoughts and I'll make the changes accordingly. Thanks!

@jnothman
Member

jnothman commented Jan 4, 2021

Re the flat dict, I was thinking to put true then pred in the keys, in line with the input args. Because it parallels the input args, I would not need "true" and "pred" explicitly there (although OTOH, your example violated my intuitions!). So more along the lines of:

C = {('ant', 'ant'): 2, ('bird', 'ant'): 0, ('cat', 'ant'): 1,
 ('ant', 'bird'): 0, ('bird', 'bird'): 0, ('cat', 'bird'): 0,
 ('ant', 'cat'): 0, ('bird', 'cat'): 1, ('cat', 'cat'): 2}

then

>>> pd.Series(C)
ant   ant     2
bird  ant     0
cat   ant     1
ant   bird    0
bird  bird    0
cat   bird    0
ant   cat     0
bird  cat     1
cat   cat     2
>>> pd.Series(C).unstack()
      ant  bird  cat
ant     2     0    0
bird    0     0    1
cat     1     0    2

@shubhamdo
Author

shubhamdo commented Jan 4, 2021

I've made changes to the code to replicate the output like you mentioned.

    if pprint:
        labelList = labels.tolist()
        cm_list = cm.tolist()
        cm_dict = {(labelList[j], labelList[i]): cm_list[j][i] for i in range(0, len(labelList))
                   for j in range(0, len(cm_list))}

        return cm_dict

Output:
[screenshot: final_output]
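
The same (true, pred)-keyed flat dict can also be built directly from the inputs with the standard library alone. A rough sketch; `confusion_counts` is a hypothetical name, and unlike `confusion_matrix` it does no input validation or sample weighting:

```python
from collections import Counter
from itertools import product


def confusion_counts(y_true, y_pred, labels):
    """Hypothetical helper: flat dict keyed by (true_label, pred_label)."""
    counts = Counter(zip(y_true, y_pred))
    # Enumerate every label pair so that zero cells are explicit.
    # Looking up a missing key on a Counter returns 0 without inserting it.
    return {(t, p): counts[(t, p)] for t, p in product(labels, repeat=2)}


y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
C = confusion_counts(y_true, y_pred, labels=["ant", "bird", "cat"])
print(C[("bird", "cat")])  # true "bird" predicted as "cat"
```

The key order parallels the argument order of the function (true first, then pred), as suggested earlier in the thread.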

@glemaitre
Member

@jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack.

@glemaitre
Member

In some way, I am thinking that adding this feature would be enough to close #5516 and avoid implementing #17265.

@jnothman
Member

> @jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack

How would you make it intuitively clear which axis is true and which pred? Or would we rely on documentation?

@jnothman
Member

> In some way, I am thinking that adding this feature would be enough to close #5516 and avoid implementing #17265.

There, reference implementations of commonly used metrics are sought... so not really. Also, scorers might be helpful there.

@varunjohn786

> @jnothman What would be the issue with nested dict to get multi-index in the dataframe? It would avoid doing the unstack

> How would you make it intuitively clear which axis is true and which pred? Or would we rely on documentation?

@jnothman won't this reliance on documentation, and the ambiguity about which axis is true and which is pred, persist even with a tuple-keyed dictionary?

@glemaitre
Member

@jnothman When I thought about the nested dict, I was thinking that "Actual label" and "Predicted label" would be the outer index that could be used to create the dataframe. In that case there should be no ambiguity, should there?
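
One possible reading of this suggestion, sketched with pandas directly rather than through a nested dict: the outer index level names the axis, so the orientation is carried by the data itself. The construction below is an assumption about what was meant, not code from the linked PR.

```python
import pandas as pd

labels = ["ant", "bird", "cat"]
cm = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]  # rows: true, columns: predicted

# The outer level of each MultiIndex states which axis is which.
cm_df = pd.DataFrame(
    cm,
    index=pd.MultiIndex.from_product([["Actual label"], labels]),
    columns=pd.MultiIndex.from_product([["Predicted label"], labels]),
)
print(cm_df)
```

A cell is then addressed with both levels, for example `cm_df.loc[("Actual label", "bird"), ("Predicted label", "cat")]`.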

@tansaku

tansaku commented Nov 6, 2022

I think that a flag to get output with more labels would be a huge win - I guess we'd need to try and get this PR merged? #19190

7 participants