labels argument of classification_report is not useful when y is a list of strings #3123
Comments
Thanks for reporting. Can you give a small example? |
Sure, sorry for not doing that! This used to work:
And this is how it fails now:
|
+1 for restoring the previous behaviour. This would be in line with #2610. |
@ogrisel This has been fixed by Joel's #4287
>>> import sklearn
>>> sklearn.__version__
0.17.dev0
>>> from sklearn.metrics import classification_report
>>> y_true = ['foo', 'bar', 'baz', 'spam']
>>> y_pred = ['foo', 'bar', 'bar', 'spam']
>>> print classification_report(y_true, y_pred, labels=['bar', 'spam'])
             precision    recall  f1-score   support
        bar       0.50      1.00      0.67         1
       spam       1.00      1.00      1.00         1
avg / total       0.75      1.00      0.83         2 |
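The per-label rows in the report above can be checked by hand. The following is a minimal pure-Python sketch, not scikit-learn's implementation; `per_label_scores` is a hypothetical helper name introduced only for illustration. It restricts the computation to the requested string labels, as the `labels` argument does.

```python
def per_label_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1, support)} for the requested labels only.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    scores = {}
    for label in labels:
        # True positives: samples where both truth and prediction equal this label.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)  # times the label was predicted
        support = sum(1 for t in y_true if t == label)    # times the label truly occurs
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1, support)
    return scores


y_true = ['foo', 'bar', 'baz', 'spam']
y_pred = ['foo', 'bar', 'bar', 'spam']
scores = per_label_scores(y_true, y_pred, labels=['bar', 'spam'])
for label, (p, r, f, s) in scores.items():
    print(f"{label:>5} {p:.2f} {r:.2f} {f:.2f} {s}")
```

Here 'bar' is predicted twice but is correct only once, which is where its precision of 0.50 comes from; 'foo' and 'baz' are left out because they are not in `labels`.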
Indeed, thanks for the heads up @rvraghav93, closing. |
I still have this problem and do not know how to fix it.
from sklearn.preprocessing import LabelEncoder
from sklearn_pandas import DataFrameMapper
mapper = DataFrameMapper([
    ('AgeGroup', LabelEncoder()),
    ('Education', LabelEncoder()),
    ('Workclass', LabelEncoder()),
    ('MaritalStatus', LabelEncoder()),
    ('Occupation', LabelEncoder()),
    ('Relationship', LabelEncoder()),
    ('Race', LabelEncoder()),
    ('Sex', LabelEncoder()),
    ('Income', LabelEncoder()),
], df_out=True, default=None)
cols = list(df_train_set.columns)
df_train = mapper.fit_transform(df_train_set.copy())
df_test = mapper.transform(df_test_set.copy())
cols.remove("Income")
|
Your issue has nothing to do with classification_report, and belongs on a user forum or Stack Overflow, not a bug tracker. You have punctuation and whitespace in some of your labels that is causing your error. |
Thanks a lot, I will try to fix it.
|
In scikit-learn 0.14.1 it was possible to have y_true and y_pred be lists of strings and to pass a list of strings as the labels argument to classification_report, and it worked as expected: only labels from this list were included in the report. This no longer works in scikit-learn master.
It was never documented that it should work: the docs say that labels is an "Optional list of label indices to include in the report." So, according to the docs, it was undefined what happens if y consists of strings and the labels argument is passed; the caller doesn't have correct indices to pass in this case.
It seems better to either raise an error if labels is passed when y is not pre-transformed by a LabelEncoder, or to restore and document the 0.14.1 behavior. What do you think?
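To make the "label indices" reading concrete, here is a rough sketch of what a caller would have to do to turn string labels into indices. It assumes indices are positions in the sorted set of classes, which is how LabelEncoder orders its classes_. `labels_to_indices` is a hypothetical helper, not part of scikit-learn, and the error it raises is only one possible way to reject labels that cannot be mapped.

```python
def labels_to_indices(y, labels):
    """Map string labels to integer indices in sorted-class order.

    Hypothetical helper for illustration; assumes indices follow the
    sorted order of the distinct values in y.
    """
    classes = sorted(set(y))
    index_of = {cls: i for i, cls in enumerate(classes)}
    unknown = [label for label in labels if label not in index_of]
    if unknown:
        # Without a known class, there is no meaningful index to return.
        raise ValueError(f"labels not present in y: {unknown}")
    return [index_of[label] for label in labels]


y_true = ['foo', 'bar', 'baz', 'spam']
indices = labels_to_indices(y_true, ['bar', 'spam'])
print(indices)
```

With the sorted classes bar, baz, foo, spam, the labels 'bar' and 'spam' map to positions 0 and 3. Accepting the strings directly, as in 0.14.1, spares the caller this extra step.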