[MRG] replace 'f1' scorer by explicit variants #2676


Closed
jnothman wants to merge 3 commits from jnothman:explicit_prf_scorers into scikit-learn:master

Conversation

@jnothman
Member

This makes the averaging options for P/R/F scorers clearer for users, and avoids users getting binary behaviour when they shouldn't (cf. #2094, where scoring isn't used). I think this is extra important because "weighted" F1 isn't especially common in the literature, and having people report it without realising that's what they are reporting is unhelpful to the applied ML community. This helps, IMO, towards a more explicit and robust API for binary classification metrics (cf. #2610).

It also entails a deprecation procedure for scorers, and adds some API there: get_scorer and list_scorers.
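A minimal sketch of what the explicit variants look like in use. This is not the PR's own code: it uses the modern `sklearn.model_selection` layout, and `get_scorer` as it exists in released scikit-learn (`list_scorers` is the name proposed in this PR and may not exist under that name):

```python
# Sketch: requesting an explicit averaging variant ('f1_macro') instead
# of the ambiguous 'f1' scoring string, and looking up a scorer object.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=100, n_classes=3, n_informative=4,
                           random_state=0)
clf = LogisticRegression(max_iter=1000)

# Explicit averaging: no silent fallback to binary or 'weighted' behaviour.
macro_scores = cross_val_score(clf, X, y, cv=3, scoring="f1_macro")
print(macro_scores.mean())

# get_scorer turns a scoring string into a callable scorer object.
scorer = get_scorer("f1_micro")
clf.fit(X, y)
print(scorer(clf, X, y))
```

The point of the explicit names is that a grid search or cross-validation run states its averaging scheme up front, rather than inheriting whatever default `'f1'` happens to mean.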

This makes the averaging options clearer for users

It entails a deprecation procedure for scorers.
@@ -93,5 +93,7 @@
'v_measure_score',
'consensus_score',
'zero_one_loss',
'get_scorer',
'list_scorers',
'make_scorer',
'SCORERS']
@jnothman
Member Author

Should I remove SCORERS from sklearn.metrics package import?

@coveralls

Coverage Status

Coverage remained the same when pulling 777ce58 on jnothman:explicit_prf_scorers into 81336ae on scikit-learn:master.

@mblondel
Member

I think we can even deprecate the "weighted" scheme and remove it in two releases.
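To see why "weighted" is worth deprecating as a default, here is a small illustration (my own example, not from the PR) of how it diverges from the macro average on imbalanced data: it weights each class's F1 by its support, which is not what most papers mean by "F1".

```python
# Sketch: 'weighted' vs 'macro' F1 disagree once classes are imbalanced.
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 1, 1]

macro = f1_score(y_true, y_pred, average="macro")        # (F1_0 + F1_1) / 2
weighted = f1_score(y_true, y_pred, average="weighted")  # support-weighted mean

print(macro, weighted)  # macro ≈ 0.762, weighted ≈ 0.819
```

Because the majority class here is also the better-predicted one, the support weighting inflates the reported score relative to the macro average.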

@jnothman
Member Author

Well, it's been the default for some time. But you're right that without this sort of patch we couldn't even consider deprecating weighted.

@mblondel
Member

Yes, I think it's a good opportunity. I completely agree that people should not report results in the literature with this option!

@coveralls

Coverage Status

Coverage remained the same when pulling c21f301 on jnothman:explicit_prf_scorers into 81336ae on scikit-learn:master.

@mblondel
Member

Quick question: do f1_macro and f1_micro reduce to f1_binary when there are only 2 classes in y_true?

@jnothman
Member Author

No, because pos_label=None: it will take the macro average of the two class scores. And it must be this way to avoid #2094-like problems. But if there's somewhere it can be documented more explicitly, we could do that.
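This distinction is easy to demonstrate (my own example, using the modern `f1_score` signature with `average="binary"`): on a 2-class problem, `f1_macro` averages both classes' F1 scores, while binary F1 scores only the positive class.

```python
# Sketch: with two classes, f1_macro does NOT reduce to binary F1.
from sklearn.metrics import f1_score

y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 1, 1]

binary = f1_score(y_true, y_pred, average="binary", pos_label=1)
macro = f1_score(y_true, y_pred, average="macro")

print(binary, macro)  # binary ≈ 0.667, macro ≈ 0.762
```

The macro figure folds in the (here, higher) F1 of the negative class, which is exactly the behaviour that keeps the scorer safe when label 1 isn't guaranteed to be the class of interest.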

@jnothman
Member Author

I forgot to add an option here for average='samples'. I'll fix that up soon.

@coveralls

Coverage Status

Coverage remained the same when pulling 03e28a7 on jnothman:explicit_prf_scorers into 81336ae on scikit-learn:master.

@jnothman
Member Author

Merged into #2679

@jnothman jnothman closed this Jan 18, 2014