Skip to content

Deprecated negative valued scorers #6028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

lvrzhn
Copy link

@lvrzhn lvrzhn commented Dec 14, 2015

  1. deprecated X = mean_squared_error, mean_absolute_error, median_absolute_error, log_loss, when called, print "Function returns negative of X so that greater=better. This function is deprecated in version
    0.18 and will be removed in a future version 0.20. Use neg_* instead"

  2. replaced X_scorer by neg_X_scorer throughout scikit-learn source code as well as tests. Documented in doc/modules/model_evaluation.rst

Still need to do:
3) add a new test that checks if you use deprecated one you get deprecation warning and same behavior

@amueller
Copy link
Member

fixes #5023.
Thanks @lvrzhn

'mean_absolute_error' :func:`metrics.mean_absolute_error`
'mean_squared_error' :func:`metrics.mean_squared_error`
'median_absolute_error' :func:`metrics.median_absolute_error`
'neg_mean_absolute_error' :func:`metrics.mean_absolute_error` Negative MAE returned so greater=better
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this explanation is the best possible but I don't have a better one ;)
\

@amueller
Copy link
Member

So the problem here is that deprecate can deprecate classes or functions, but the scorers are actually instances of classes, that is they are objects.

The way I propose to solve this, is to deprecate the __call__ method of these objects.
The way to do this is

message = "Function returns negative %s so that greater=better. This function is deprecated in version 0.18 and will be removed in version 0.20. Use neg_mean_squared_error instead"

mean_squared_error_scorer.__call__ = deprecated(message % "MSE")(mean_squared_error_scorer.__call__)

above that you have to define a mean_squared_error_scorer as before.

@amueller
Copy link
Member

(I also defined the message as an extra variable so it is in the source code only once.)

@amueller
Copy link
Member

hm... I just realized that this object can then no longer be pickled. So this solution doesn't actually work.

Maybe @ogrisel @agramfort @GaelVaroquaux or @jnothman have ideas how to solve this?

My best idea at the moment would be to add an additional "deprecation" parameter that takes a message, and the scorer classes store this in an attribute, and when they are called, they check if the attribute is set, and if so, raise a deprecation warning.

something like adding in _PredictScorer.__call__

if self.deprecation is not None:
    raise DeprecationWarning(self.deprecation)

and similarly in the other scorers.

@jnothman
Copy link
Member

The scorers are not part of the public API, so should not be deprecated like that (apart from the verbosity of the code where a loop should be used instead). The deprecation message should be triggered in get_scorer instead, which is public, and should be in use every time a scorer is needed (if not, we need to fix it!)

So get_scorer needs a list of scorer names for which it raises a warning suggesting that the name with the neg_ prefix should be used instead.

@jnothman
Copy link
Member

IMO, SCORERS should also be removed from sklearn.metrics.__all__, and perhaps sklearn.metrics.scorer.__all__ should be defined just for clarity of what is in the public API.

@jnothman
Copy link
Member

Hmm... sklearn/linear_model/logistic.py uses SCORERS and should use get_scorer. model_evaluation.rst also references SCORERS and should not.

When I previously needed deprecation of scorers (but not in the merged version of the code), I also introduced a list_scorers function that meant a list of scorers (ignoring those that are deprecated) was available in a machine-readable form (not just in an error message from get_scorer or the keys of SCORERS).

@amueller
Copy link
Member

It's a bit unclear currently whether SCORERS are pubic API.
It's mentioned in a note here:
http://scikit-learn.org/dev/modules/model_evaluation.html#common-cases-predefined-values

Maybe that was not a good idea and it should be private and the deprecation should be in get_scorer. Do you have a snippet of your list_scorers? Do you define a global list of deprecated scorers in the scorers module?

@amueller amueller closed this Dec 15, 2015
@jnothman
Copy link
Member

Sorry, I've lost that code. I think I may have defined a SCORER_DEPRECATION
dict mapping name to deprecation warning text

On 16 December 2015 at 02:49, Andreas Mueller notifications@github.com
wrote:

Closed #6028 #6028.


Reply to this email directly or view it on GitHub
#6028 (comment).

@jnothman
Copy link
Member

@amueller, was your closing this PR accidental? Not that we heard more from @lvrzhn, but you can't exactly blame them for going quiet after their first PR was mysteriously closed a day after submission...

@raghavrv
Copy link
Member

@lvrzhn could you kindly update us if you are still interested/available to work on this PR?

@lvrzhn
Copy link
Author

lvrzhn commented May 11, 2016

Yes, I am still interested in collaborating on this!

@raghavrv
Copy link
Member

First could you rebase please?

@raghavrv
Copy link
Member

@lvrzhn This is an important issue, would you mind if I cherry picked your commit (you will be the author) and continued on your work in a separate PR?

@lvrzhn
Copy link
Author

lvrzhn commented May 27, 2016

Hi raghavr, sorry for the delay. I wouldn't mind, please go ahead. Thanks a lot.

@amueller
Copy link
Member

replaced by #7261.

@amueller amueller closed this Sep 12, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants