Skip to content

EHN allow scorers to set addtional parameter of scoring function #17962

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 20 commits into from

Conversation

glemaitre
Copy link
Member

closes #17942

Add a parameter copy to get_scorer to whether or not make a copy of a custom metric.
In addition, internal scikit-learn metrics will be copied.

@glemaitre
Copy link
Member Author

ping @adrinjalali @thomasjpfan

I think this is a compromise that is still allowing to not make a copy for a custom scorer.

@glemaitre
Copy link
Member Author

I will check the different call to get_scorer internally and make sure that we trigger a copy.

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we consider this a bug so we do not need to introduce the copy parameter?

get_scorer can remain a noop if it is a callable.

@glemaitre
Copy link
Member Author

It would be fine with me.

@thomasjpfan
Copy link
Member

I was thinking doing deepcopy for only the scorers:

scorer = deepcopy(SCORERS[scoring])

and a noop for callables.

thomasjpfan
thomasjpfan previously approved these changes Jul 31, 2020
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment

Otherwise LGTM!

@jnothman
Copy link
Member

jnothman commented Aug 1, 2020

Please update the title.
I'm +0.5 for this. I don't see why a user should be getting and modifying a scorer. Though I guess this reduces surprising bugs if they do.

@glemaitre
Copy link
Member Author

I'm +0.5 for this. I don't see why a user should be getting and modifying a scorer. Though I guess this reduces surprising bugs if they do.

I don't like to not have a full +1 from @jnothman :)

More seriously, in our internal use case, we would have a user giving us roc_auc in GridSearchCV which would pick the roc_auc_scorer. However, to be sure that everything goes fine, we would need to set pos_label (and find the right columns in the predictions). So we would like to modify the scorer. Another use case (again internally), when developing the convenience functions to return the set of possible scorers, we will need to modify some parameters and modify them.

So some potential alternative:

  • let get_scorer returning a copy;
  • make the deep copy at the level where we will need to modify the scorer;
  • to have a make_scorer that can take a scorer already and returned a new object with modified parameters.

@glemaitre glemaitre changed the title FIX/ENH add copy parameter to get_scorer FIX make sure get_scorer returns a deepcopy of internal sklearn scorers Aug 3, 2020
@jnothman
Copy link
Member

jnothman commented Aug 3, 2020 via email

@glemaitre
Copy link
Member Author

OK this seems reasonable. I will propose an alternative that will expose a public API.

@glemaitre glemaitre changed the title FIX make sure get_scorer returns a deepcopy of internal sklearn scorers EHN allow scorers to set addtional parameter of scoring function Aug 4, 2020
@@ -165,6 +167,38 @@ def _factory_args(self):
"""Return non-default make_scorer arguments for repr."""
return ""

def set_kwargs(self, **kwargs):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jnothman is it this type of interface that you had in mind?

@adrinjalali
Copy link
Member

Please update the title.
I'm +0.5 for this. I don't see why a user should be getting and modifying a scorer. Though I guess this reduces surprising bugs if they do.

With sample props, the scorer defines which parameter should be passed to it, and that's something which is changed by the user and it shouldn't be changed on the singleton object.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments here.

Yes, you're right @adrinjalali, that's a valid use-case for get_scorer copying. But it's not a current use-case.

I'm not yet sure if get_scorer(name, **kw) is motivated. When do we need it? Also, using ** blocks the option to extend get_scorer in other ways, so I'd be careful about this particular signature.

And, while I didn't think of it previously, a minimum sufficient change to the scorer objects would have been to rename the _kwargs attribute to kwargs.

@glemaitre
Copy link
Member Author

I'm not yet sure if get_scorer(name, **kw) is motivated. When do we need it? Also, using ** blocks the option to extend get_scorer in other ways, so I'd be careful about this particular signature.

I need to go back to where I need it. I think that I am getting confused :)

And, while I didn't think of it previously, a minimum sufficient change to the scorer objects would have been to rename the _kwargs attribute to kwargs.

True.

@glemaitre
Copy link
Member Author

OK let me put the steps that make me think that we need to have a copy (I am struggling myself to follow them):

  1. A user will pass scoring='roc_auc' in the GridSearchCV.
  2. This roc_auc will call the scorer roc_auc_scorer created as make_scorer(roc_auc_score, ...).
  3. This scorer is a _ThresholdScorer which might call y_pred and because roc_auc_score takes only a single column, it will need to infer the positive label which will be y_pred[:, 1].
  4. However, in the case of passing y_true as a string dtype, we will need to binarize it (which is currently using np.unique).

At this stage, there is a potential bug because the encoding of y_true is not necessarily corresponding to the encoding induced by y_pred. The ambiguity could be removed by passing a parameter pos_label. It is what we try there: #17704

  1. However, we are lucky because roc_auc_score is a symmetric metric. Even if pos_label is not we can manage to compute the score. We just need to be sure to select the column of y_pred which corresponds to the positive label used to encode y_true.

So #17704 can resolve this issue as follow:
Detect that we have symmetric scores (maybe using some score properties metadata). In this case, _ThresholdScorer and _ProbaScorer can safely modify their attribute self._kwargs by imposing a pos_label, and ensuring the right encoding of y_true by passing pos_label to self._score_func.

However, by self-mutating, we are actually mutating SCORER["roc_auc"] and thus sklearn.metrics.roc_auc_scorer.

We actually have this issue when running the test suite: we import roc_auc_scorer and mutate it for one of the dataset and it will be failing when using in another test because pos_label will be set (and the debugging was not fun :)).

When I try to resolve these problems, I have the impression to be at the front of a chicken-egg problem (with many chickens and many eggs :S):

  1. Solving the issue with pos_label in roc_auc_score: this one is easy because we can just introduce the parameter and raise an error when we the pos_label should be provided.
  2. Solving the previous issue will introduce a regression since we are currently supporting (sometimes wrongly) roc_auc without specifying pos_label in GridSearchCV. If we want to support it, we will need to: (i) detect if a score is symmetric and (ii) self-mutate a scorer.
  3. The previous self-mutation will be an issue if we don't copy when we get the scorer with get_scorer.
  4. I am not sure what would be the best way to add metadata to scores such that we know some of the properties (I was going to propose something like this: https://github.com/scikit-learn/scikit-learn/pull/17930/files#diff-e907207584273858caf088b432e38d04R772)

Sorry for the long post but I tried to summarize the best I could. I already annoyed @thomasjpfan with a call where we kind of concluded that the issues could be decoupled but it does not seem to be that easy :) It seems that we should solve the issue in the reverse order that I pointed out. Since solving these issues might introduce some new API changes, I would really appreciate any advice and thoughts regarding these problems :)

@jnothman
Copy link
Member

jnothman commented Aug 5, 2020 via email

@thomasjpfan
Copy link
Member

thomasjpfan commented Aug 5, 2020

Personally, I would prefer to continue the assumption that scorers were immutable (if possible).

If the goal is to resolve #17704, we do not need to modify get_scorers. In _score, we can create a copy of kwargs in _score, add pos_label to it, and then pass it along to self._score_func. This way the scorer object renames unchanged.

@thomasjpfan thomasjpfan dismissed their stale review August 5, 2020 18:09

There may be another way to resolve #17704 while keeping get_scorer the same.

@adrinjalali
Copy link
Member

I see scorers as an object which have a state, and the sate can also be data dependent, especially with custom scorers which deal with column or sample meta-data.

I agree one solution is to make everybody to use make_scorer to get a scorer, but I don't find it intuitive and I believe it'll result in silent or loud bugs/issues by the users.

And I know this is about get_scorer, but I do remember people hacking the dictionary and adding custom scorers to the dict to make their code look nicer.

On the other hand, could you (@thomasjpfan , @jnothman ) explain why immutability of scorers is so important?

@jnothman
Copy link
Member

jnothman commented Aug 12, 2020 via email

@glemaitre
Copy link
Member Author

glemaitre commented Aug 12, 2020 via email

@jnothman
Copy link
Member

jnothman commented Aug 12, 2020 via email

@jnothman
Copy link
Member

jnothman commented Aug 12, 2020 via email

@glemaitre
Copy link
Member Author

Indeed, get_params and set_params would be enough.

@glemaitre
Copy link
Member Author

So, we could have set_params and get_params to deal with the scoring_func parameters. I don't think these 2 methods should allow changing either the scoring_func neither the sign. Changing those should require a call to make_scorer.

I'm not yet sure if get_scorer(name, **kw) is motivated. When do we need it? Also, using ** blocks the option to extend get_scorer in other ways, so I'd be careful about this particular signature.

From what I was drafting in #18141, it seems that there is a use case in order to be more convenient:

    multi_class = "multinomial"
    labels = np.unique(y)
    scoring = get_scorer("neg_log_loss", labels=labels)
    random_state = 1

    model = LogisticRegressionCV(
        multi_class=multi_class, scoring=scoring, random_state=random_state
    )
    model.fit(X, y)

if we don't allow **kwargs one needs to have an extra line to call set_params on the returned scorer.
Would it not still be possible to extend the get_scorer using keywords-only argument when introducing new arguments?

@adrinjalali
Copy link
Member

I think what your talking about here should be framed not as making them
estimators, but making them parameterized like estimators and kernels.

Parameterized is a much better way of conveying what I meant by "having a state".
Does that mean you're leaning towards accepting this PR or a variation of it?

Base automatically changed from master to main January 22, 2021 10:52
@thomasjpfan
Copy link
Member

Going back to get this, I agree with #17962 (comment). This allows for a easy transition from using a string for scoring and then creating your own custom scorer.

If a user were to use make_scorer, I think it would be okay to have them call make_scorer again with the extra kwargs. This way we do not need the set_kwargs which keeps the scorers immutable.

@adrinjalali
Copy link
Member

@glemaitre what do we need to do here? in the meantime, we've also deprecated SCORERS and users always get a copy, and set_score_request mutates self.

@glemaitre
Copy link
Member Author

I'm not sure anymore if I agree with myself of 4 years ago. It seems that I expected get_scorer to do the job of make_scorer in some way but without having to pass the function but only using the string.

I think we can close it this PR since the state of the scorers is better nowadays. We probably more work on the API scorer side.

@glemaitre glemaitre closed this May 18, 2024
@glemaitre
Copy link
Member Author

Basically, I think it was the idea that I had in #18141

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should get_scorer return a deep copy of the scorer object
4 participants