
[MRG] Ensure that classification metrics support string label #2170


Closed
wants to merge 9 commits into from

Conversation

arjoly
Member

@arjoly arjoly commented Jul 19, 2013

Since most estimators accept strings as input, this PR makes sure that most metrics do too.

It doesn't cover metrics that take y_score.

This should fix #2168 and #1989.
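As a minimal sketch of what string-label support means (a hypothetical example in plain NumPy, not the scikit-learn metrics touched by this PR), string labels can be compared element-wise exactly like integer ones:

```python
import numpy as np

# Hypothetical example: class labels as strings instead of integers.
y_true = np.array(["cat", "dog", "dog", "cat"])
y_pred = np.array(["cat", "dog", "cat", "cat"])

# Element-wise equality works the same for string arrays,
# so a simple accuracy needs no label encoding.
accuracy = np.mean(y_true == y_pred)
print(accuracy)  # 0.75
```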

@arjoly
Member Author

arjoly commented Jul 19, 2013

Damn, this works fine on my laptop.
I don't understand what is happening with Travis. :-(

def test_classification_report_multiclass_with_string_label():
    y_true, y_pred, _ = make_prediction(binary=False)

    y_true = y_true.astype(np.str)
Member

Here's the bug: this produces an array of dtype='|S1' so the labels all get truncated to a single char. I'm on it.
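A minimal sketch of that pitfall (shown here with a hypothetical label set, on Python 3 where the fixed-width unicode dtype behaves analogously to the `'|S1'` bytes dtype in the report above): casting through a too-narrow fixed-width dtype silently cuts every label to one character.

```python
import numpy as np

y_true = np.array([0, 1, 2, 2, 1])

# Map integer classes to multi-character names (hypothetical label set).
names = np.array(["setosa", "versicolor", "virginica"])[y_true]

# Casting to a width-1 dtype silently truncates each element; this is
# the same silent-truncation mechanism described in the comment above.
truncated = names.astype("U1")
print(truncated)  # ['s' 'v' 'v' 'v' 'v']
```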

@arjoly
Member Author

arjoly commented Jul 25, 2013

Ready for reviews!


avg / total 0.62 0.61 0.56 75
Member

why did this change?

Member Author

This was written twice:

    # print classification report with label detection
    expected_report = """\
             precision    recall  f1-score   support

          0       0.82      0.92      0.87        25
          1       0.56      0.17      0.26        30
          2       0.47      0.90      0.62        20

avg / total       0.62      0.61      0.56        75
"""
    expected_report = """\
             precision    recall  f1-score   support

          0       0.83      0.79      0.81        24
          1       0.33      0.10      0.15        31
          2       0.42      0.90      0.57        20

avg / total       0.51      0.53      0.47        75
"""

The first expected_report is overwritten by the second assignment.

@amueller
Member

LGTM +1

@arjoly
Member Author

arjoly commented Jul 25, 2013

Should fix #1989

@ogrisel
Member

ogrisel commented Jul 25, 2013

Looks good to me as well, but this branch requires a rebase / conflict resolution before it can be merged.

@amueller
Member

@arjoly do you want to rebase or should I?

@arjoly
Member Author

arjoly commented Jul 25, 2013

I'll do it

@arjoly
Member Author

arjoly commented Jul 25, 2013

@amueller rebased on top of master

@amueller
Member

fail ;)

@arjoly
Member Author

arjoly commented Jul 25, 2013

Hope it will be good now.

@arjoly
Member Author

arjoly commented Jul 25, 2013

@amueller it works ;-)

@amueller
Member

Merged by rebase. Thanks a lot!

Successfully merging this pull request may close these issues.

Problem with String Classes and Scorers

4 participants