Enhancement to Confusion Matrix Output Representation for improving readability #19012 #19190
Conversation
Thanks for the PR.
sklearn/metrics/_classification.py (Outdated)
@@ -249,6 +249,10 @@ def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
        conditions or all the population. If None, confusion matrix will not be
        normalized.

+   pprint : bool, default=False
let's call this `as_dict`?
sklearn/metrics/_classification.py (Outdated)
@@ -257,6 +261,14 @@ def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
        samples with true label being i-th class
        and predicted label being j-th class.

+       Or
This isn't valid numpydoc. The types need to be mentioned all on the first line.
- Changed the parameter name from `pprint` to `as_dict`
- Changed the testing function in test_classification.py
- Changed the docstring for the function, added an explanation of the Series usage
Not sure about the numpydoc; I have changed it, please review.
- Or should I mention it as `tuple[ndarray, dict['true_class', 'pred_class']]`?
sklearn/metrics/_classification.py (Outdated)
@@ -249,6 +249,10 @@ def confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None,
        conditions or all the population. If None, confusion matrix will not be
        normalized.

+   pprint : bool, default=False
+       Returns a confusion matrix in dict representation with labels as keys
+       ('true', 'pred')
It would be worth briefly noting the usage with pandas and unstack.
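The pandas usage the reviewer is suggesting could look like the sketch below. It assumes the dict output proposed in this PR maps `(true_label, predicted_label)` pairs to counts (here the dict is built by hand, since the parameter is not in a released scikit-learn); a `Series` built from such a dict gets a two-level index, and `unstack` pivots it into a readable table with true labels as rows and predicted labels as columns.

```python
import pandas as pd
from collections import Counter

y_true = ["ant", "ant", "bird", "bird", "cat", "cat"]
y_pred = ["ant", "bird", "bird", "bird", "cat", "ant"]

# Build the flat (true, pred) -> count dict this PR proposes to return.
counts = Counter(zip(y_true, y_pred))
labels = sorted(set(y_true) | set(y_pred))
cm_dict = {(t, p): counts.get((t, p), 0) for t in labels for p in labels}

# A Series over tuple keys gets a MultiIndex; unstack() pivots the second
# level (predicted label) into columns, giving a labeled confusion table.
table = pd.Series(cm_dict).unstack()
print(table)
```

This is the readability win: each cell is addressable by label, e.g. `table.loc["bird", "bird"]`, instead of by positional index into an ndarray.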
1. Changed the parameter name from `pprint` to `as_dict`
2. Changed the testing function in test_classification.py
3. Tested
4. Changed the docstring for the function, added an explanation of the Series usage
I guess this PR has stalled?
Reference Issues/PRs
Fixes #19012
What does this implement/fix? Explain your changes.
When there are many classes, the raw ndarray can be difficult to read, because associating each row and column with its true or predicted label requires counting positions. This is an enhancement to the output of the confusion_matrix function that represents the true and predicted values explicitly for multiclass problems.
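A minimal pure-Python sketch of the proposed dict representation, built with the standard library only. The helper name `confusion_matrix_as_dict` is hypothetical and not part of scikit-learn; it only illustrates the output shape discussed in this PR, where keys are `(true_label, predicted_label)` pairs and values are counts.

```python
from collections import Counter

def confusion_matrix_as_dict(y_true, y_pred):
    """Hypothetical sketch of the PR's proposed output: a dict mapping
    (true_label, predicted_label) pairs to counts, including zero cells."""
    counts = Counter(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    return {(t, p): counts.get((t, p), 0) for t in labels for p in labels}

y_true = ["cat", "ant", "cat", "cat", "ant", "bird"]
y_pred = ["ant", "ant", "cat", "cat", "ant", "cat"]
cm_dict = confusion_matrix_as_dict(y_true, y_pred)
print(cm_dict)
```

Compared with the ndarray, each entry states its labels directly, e.g. `cm_dict[("cat", "ant")]` is the number of cats predicted as ants, so no positional bookkeeping is needed.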
Any other comments?