[MRG] Add balanced accuracy score in metrics #5588

Closed
wants to merge 7 commits into from

Conversation

TTRh

@TTRh TTRh commented Oct 24, 2015

I worked on #3506 (adding the balanced accuracy score) during the sprint. It's my first contribution.
There were already three PRs on that issue, but none was completed: #4300 #3929 #3511

I implemented a simple version which only works for the binary classification case. I also added tests and documentation.

Thanks for the feedback!


# convert y_true and y_pred into label indices
y_pred = np.array([label_to_ind.get(x, n_labels + 1) for x in y_pred])
y_true = np.array([label_to_ind.get(x, n_labels + 1) for x in y_true])
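
For context, this mapping sends each known label to its index and any out-of-vocabulary label to the sentinel `n_labels + 1`; a minimal standalone sketch (the `labels` list here is made up for illustration):

```python
import numpy as np

labels = ['neg', 'pos']                    # assumed sorted unique labels
n_labels = len(labels)
label_to_ind = {label: i for i, label in enumerate(labels)}

y_pred = ['neg', 'pos', 'unknown']
# Unknown labels map to the out-of-range sentinel n_labels + 1.
y_pred_idx = np.array([label_to_ind.get(x, n_labels + 1) for x in y_pred])
print(y_pred_idx)  # [0 1 3]
```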
Member


Maybe we can re-use the confusion matrix here. What do you think?
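
A sketch of what reusing the confusion matrix could look like for the binary case (the helper name is hypothetical, not the PR's actual code):

```python
from sklearn.metrics import confusion_matrix

def balanced_accuracy_from_cm(y_true, y_pred):
    # For binary 0/1 labels, confusion_matrix returns [[tn, fp], [fn, tp]].
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return 0.5 * (sensitivity + specificity)

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(balanced_accuracy_from_cm(y_true, y_pred))  # (0.75 + 0.5) / 2 = 0.625
```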

@arjoly
Member

arjoly commented Oct 24, 2015

I would add some intuition about why this metric is useful, especially on imbalanced datasets.
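
To illustrate the intuition (a sketch, not part of the PR): on a dataset that is 90% negative, a classifier that always predicts the majority class scores 90% plain accuracy but only 50% balanced accuracy, since it recalls none of the positives:

```python
import numpy as np

y_true = np.array([0] * 90 + [1] * 10)  # 90% negative, 10% positive
y_pred = np.zeros(100, dtype=int)       # degenerate "always negative" classifier

accuracy = np.mean(y_true == y_pred)    # 0.9 -- looks deceptively good

# Balanced accuracy: average of the per-class recalls.
recall_neg = np.mean(y_pred[y_true == 0] == 0)  # 1.0
recall_pos = np.mean(y_pred[y_true == 1] == 1)  # 0.0
balanced = 0.5 * (recall_neg + recall_pos)      # 0.5 -- reveals the failure
print(accuracy, balanced)
```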

@arjoly
Member

arjoly commented Oct 24, 2015

If the above comments are addressed, I am +1! Thanks @TTRh !

@TTRh
Author

TTRh commented Oct 25, 2015

Thanks for the comments !

  • I used the confusion matrix to compute balanced accuracy, as proposed.
  • I changed the weight parameter to balance. I hope it is better; I think it would be useful to have this parameter to adapt the score.
  • For the user guide, should I add a separate paragraph as for accuracy score or confusion matrix, or just add a few words in the binary classification section?

@arjoly
Member

arjoly commented Oct 25, 2015

For the user guide, should I add a separate paragraph as for accuracy score or confusion matrix, or just add a few words in the binary classification section?

+1 for a separate section.

@duboism

duboism commented Jan 8, 2016

Any news on this PR? The code seems OK (but I don't have much experience). Two remarks:

  • The code should check for special cases like tp == 0 or pos == 0 (maybe using _prf_divide)
  • There is a formatting issue in sklearn/metrics/classification.py on lines 245-246 (missing space around ,)
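
A sketch of the kind of guard meant by the first remark (`_prf_divide` is a private scikit-learn helper; this standalone version only illustrates the idea of returning 0 on empty denominators):

```python
import numpy as np

def safe_divide(numerator, denominator):
    # Return 0.0 wherever the denominator is zero instead of raising or
    # propagating NaN, similar in spirit to sklearn's private _prf_divide
    # (which additionally emits a warning).
    numerator = np.asarray(numerator, dtype=float)
    denominator = np.asarray(denominator, dtype=float)
    result = np.zeros_like(numerator)
    mask = denominator != 0
    result[mask] = numerator[mask] / denominator[mask]
    return result

# e.g. tp / pos with no positive samples at all: 0 / 0 -> 0.0, not nan
print(safe_divide([0.0, 3.0], [0.0, 4.0]))  # [0.   0.75]
```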

@TTRh
Author

TTRh commented Jan 8, 2016

Hi, sorry, I forgot about this one; I should also add a separate section in the user guide. I will update this PR to address the comments.

@TTRh TTRh force-pushed the balanced_accuracy_score branch from fe6efaa to b7e8229 on January 11, 2016 20:13
@TTRh
Author

TTRh commented Jan 12, 2016

I just added a separate section in the user guide. I'm also thinking about extending balanced accuracy to multiclass classification problems by defining it as the recall for each class averaged over the number of classes. WDYT @arjoly?
Thanks!
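
The proposed multiclass extension (balanced accuracy as the mean of per-class recalls) can be sketched from the confusion matrix diagonal; it coincides with scikit-learn's `recall_score(..., average='macro')`:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score

y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]

cm = confusion_matrix(y_true, y_pred)
# Diagonal = correctly classified per class; row sums = true counts per class.
per_class_recall = np.diag(cm) / cm.sum(axis=1)
balanced = per_class_recall.mean()

print(balanced)
print(recall_score(y_true, y_pred, average='macro'))  # same value
```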

@duboism

duboism commented Jan 13, 2016

@TTRh Do you have an example of such an extension?

@jnothman
Member

Btw, there's been yet another attempt at this enhancement in #6752

@lesteve
Member

lesteve commented Oct 18, 2017

Closed by #8066.

@lesteve lesteve closed this Oct 18, 2017