ENH allow scorers to set additional parameters of scoring function #17962
Conversation
ping @adrinjalali @thomasjpfan I think this is a compromise that still allows not making a copy for a custom scorer.
I will check the different calls to
Can we consider this a bug so we do not need to introduce the `copy` parameter? `get_scorer` can remain a noop if it is a callable.
It would be fine with me.
I was thinking of doing `scorer = deepcopy(SCORERS[scoring])` and a noop for callables.
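For illustration, here is a minimal sketch of that idea, assuming the private `SCORERS` registry that lived in `sklearn.metrics._scorer` at the time; the helper name below is hypothetical and not the merged implementation:

from copy import deepcopy

from sklearn.metrics._scorer import SCORERS  # private registry at the time

def get_scorer_copying(scoring):
    # Sketch: return an independent copy of a registered scorer so that
    # later mutation does not leak into the shared registry entry;
    # user-provided callables are passed through unchanged (noop).
    if isinstance(scoring, str):
        return deepcopy(SCORERS[scoring])
    return scoring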
One minor comment, otherwise LGTM!
Please update the title.
I don't like not having a full +1 from @jnothman :) More seriously, in our internal use case, we would have a user giving us
So some potential alternatives:
As far as I can tell, the use cases you're talking about involve modifying
the private attributes of an object. The fact that you want to continue
doing that signals to me that something's wrong with this design... While
we often don't follow it in Scikit-learn for the sake of simplicity, good
OOP design says that if we want to allow the modification of a scorer, we
should facilitate that with a public API. We could have a method like
set_kwargs, in which case get_scorer should copy so that the setting can
happen in-place, or a method like copy_with_kwargs. This change makes no
sense to me because it encourages a breach of the API.
OK, this seems reasonable. I will propose an alternative that will expose a public API.
@@ -165,6 +167,38 @@ def _factory_args(self):
        """Return non-default make_scorer arguments for repr."""
        return ""

    def set_kwargs(self, **kwargs):
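The diff above only shows the signature; a plausible body, sketched here as an assumption rather than the actual patch, would simply update the stored keyword arguments of the scorer:

    def set_kwargs(self, **kwargs):
        # Sketch: merge the new keyword arguments into the ones that are
        # forwarded to the underlying metric function at call time.
        self._kwargs.update(kwargs)
        return self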
@jnothman is it this type of interface that you had in mind?
With sample props, the scorer defines which parameters should be passed to it; that is something which is changed by the user, and it shouldn't be changed on the singleton object.
A few comments here.
Yes, you're right @adrinjalali, that's a valid use-case for `get_scorer` copying. But it's not a current use-case.
I'm not yet sure if `get_scorer(name, **kw)` is motivated. When do we need it? Also, using `**` blocks the option to extend `get_scorer` in other ways, so I'd be careful about this particular signature.
And, while I didn't think of it previously, a minimum sufficient change to the scorer objects would have been to rename the `_kwargs` attribute to `kwargs`.
I need to go back to where I need it. I think that I am getting confused :)
True.
OK, let me lay out the steps that make me think we need to have a copy (I am struggling myself to follow them):
At this stage, there is a potential bug because the encoding of
So #17704 can resolve this issue as follows:
However, by self-mutating, we are actually mutating
We actually have this issue when running the test suite: we import
When I try to resolve these problems, I have the impression of facing a chicken-and-egg problem (with many chickens and many eggs :S):
Sorry for the long post, but I tried to summarize as best I could. I already annoyed @thomasjpfan with a call where we kind of concluded that the issues could be decoupled, but it does not seem to be that easy :) It seems that we should solve the issues in the reverse order from the one I listed. Since solving these issues might introduce some new API changes, I would really appreciate any advice and thoughts regarding these problems :)
Firstly: I remain unconvinced that there is a problem with users getting
incorrect roc auc results when using the standard scorer: our convention is
clearly to encode probabilities to match classes_, which should be sorted.
I think that the need to allow for pos_label in roc_auc_score, where we do
not explicitly require the input to come from a scikit-learn-compatible
classifier is reasonable and separate.
What you did not clarify is that in also testing the ability to use
pos_label in a custom roc_auc scorer in #17704, you took the approach of
getting the scorer named 'roc_auc' and modified it, rather than the
documented approach of using make_scorer with kwargs.
@adrinjalali, I had similarly assumed that the convention in implementing
SLEP06 was to follow the assumption that scorers were immutable, and hence
put the metadata request into make_scorer in
https://github.com/scikit-learn/enhancement_proposals/blob/677293140f8b6961ca274d0dbc6bc99e34934b50/slep006/cases_opt4.py#L14,
not as a method of the scorer.
The fact that both of you thought scorers should be mutable says something.
It indeed seems easier for a user to get a scorer and modify it (either as
a mutable object by setting state, which is more in line with scikit-learn
estimators, or as an immutable object by creating a modified copy), than to
construct it from scratch with make_scorer.
I suspect, then, that the goal of this PR should be: allow get_scorer to be
used to reconfigure a scorer. This implies making the scorer mutable and
exposing attributes or methods to mutate it, and copying in get_scorer;
alternatively, it implies making the scorer more explicitly immutable by
having configuration methods that create a new object (like Python's string
methods).
In any case, this change needs to be substantiated by updating the
documentation to demonstrate this as a friendly alternative to make_scorer.
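To make the immutable variant concrete, here is a rough sketch, written as a free function for illustration; the name `with_kwargs` and its eventual placement on the scorer class are assumptions only:

from copy import deepcopy

def with_kwargs(scorer, **kwargs):
    # Sketch: instead of mutating the scorer in place, build a new
    # scorer carrying the updated keyword arguments, similar in spirit
    # to str.replace() returning a new string.
    new_scorer = deepcopy(scorer)
    new_scorer._kwargs = {**scorer._kwargs, **kwargs}
    return new_scorer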
Personally, I would prefer to continue with the assumption that scorers are immutable (if possible). If the goal is to resolve #17704, we do not need to modify
There may be another way to resolve #17704 while keeping get_scorer the same.
I see scorers as objects which have a state, and the state can also be data dependent, especially with custom scorers which deal with column or sample metadata. I agree one solution is to make everybody use
And I know this is about
On the other hand, could you (@thomasjpfan, @jnothman) explain why immutability of scorers is so important?
I don't understand what you mean by scorers having state. I think you mean
that they are configured individually but "having state" ordinarily means
to me that during use of the object it changes the data stored in it. In
this sense, scorers are stateless, and it would be problematic if they were
not: if they were to have state modified after construction, it would imply
inconsistent measures of performance across evaluations. This is clearly
not what you mean by having a state, hence the notion that they store
configuration but are stateless.
"Hacking the dictionary" in any other framework would be called
"registering an extension" or plugin. I have no problem providing this
functionality more formally.
For me the only reason to allow the user to mutate the configuration of a
scorer is that this is consistent with BaseEstimator.set_params.
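Providing that functionality "more formally" could look like the following hypothetical helper (not part of scikit-learn), shown only to illustrate the idea of replacing direct dictionary hacking with registration:

from sklearn.metrics._scorer import SCORERS  # private registry at the time

def register_scorer(name, scorer, overwrite=False):
    # Hypothetical plugin-style registration: make a scorer available to
    # get_scorer(name) without touching the private dict by hand.
    if name in SCORERS and not overwrite:
        raise ValueError(f"A scorer named {name!r} is already registered.")
    SCORERS[name] = scorer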
Would it make sense to have `BaseScorer` inherit from `BaseEstimator`?
Since we don't have `fit` as an abstract method, we could benefit from
`set_params` and `get_params`, and the scorer would follow our estimator API.
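As a sketch of that suggestion (a hypothetical class; the real scorer base class is private and does not inherit from `BaseEstimator`):

from sklearn.base import BaseEstimator
from sklearn.metrics import f1_score

class F1Scorer(BaseEstimator):
    # Hypothetical scorer written like an estimator: every configuration
    # item is an __init__ parameter, so get_params/set_params inherited
    # from BaseEstimator work out of the box.
    def __init__(self, pos_label=1, average="binary"):
        self.pos_label = pos_label
        self.average = average

    def __call__(self, estimator, X, y):
        y_pred = estimator.predict(X)
        return f1_score(y, y_pred, pos_label=self.pos_label, average=self.average)

scorer = F1Scorer().set_params(pos_label="yes")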
I think there would be benefit in allowing scorers to be fitted on a
training set, to set labels or valid ranges. I think that's a slightly
separate issue, since it assumes that the configuration of the metric can
be estimated from data, rather than the user modifying the parameterisation
of an existing scorer explicitly. There are nice things about that
approach, and I'd be open to discussing it more explicitly.
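A rough sketch of such a data-fitted scorer (all names and behaviour here are assumptions, not an existing API):

import numpy as np
from sklearn.metrics import log_loss

class FittedLogLossScorer:
    # Hypothetical scorer whose configuration (the label set) is
    # estimated from data rather than supplied by the user.
    def fit(self, y):
        # Record the full label set so that every later evaluation uses
        # a consistent encoding, even if a CV split misses some classes.
        self.labels_ = np.unique(y)
        return self

    def __call__(self, estimator, X, y):
        proba = estimator.predict_proba(X)
        return -log_loss(y, proba, labels=self.labels_)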
Although for consistent scorers in a nested cv setting you want them to be
fitted on the entire dataset, not a training sample.
I think what you're talking about here should be framed not as making them
estimators, but making them parameterized like estimators and kernels.
Indeed,
So, we could have
From what I was drafting in #18141, it seems that there is a use case where it would be more convenient:

multi_class = "multinomial"
labels = np.unique(y)
scoring = get_scorer("neg_log_loss", labels=labels)
random_state = 1
model = LogisticRegressionCV(
    multi_class=multi_class, scoring=scoring, random_state=random_state
)
model.fit(X, y)

if we don't allow
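For comparison, a similar configuration is already possible through the documented `make_scorer` route, which accepts metric kwargs such as `labels` (shown with the `make_scorer` keywords available at the time of this discussion):

import numpy as np
from sklearn.metrics import log_loss, make_scorer

# `y` as in the snippet above; the kwargs are stored on the scorer and
# forwarded to log_loss at call time.
scoring = make_scorer(log_loss, greater_is_better=False, needs_proba=True,
                      labels=np.unique(y))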
Parameterized is a much better way of conveying what I meant by "having a state".
Getting back to this, I agree with #17962 (comment). This allows for an easy transition from using a string for
If a user were to use
@glemaitre what do we need to do here? In the meantime, we've also deprecated
I'm not sure anymore if I agree with my 4-years-ago self. It seems that I expected
I think we can close this PR since the state of the scorers is better nowadays. We probably need more work on the scorer API side.
Basically, I think it was the idea that I had in #18141.
closes #17942
Add a parameter `copy` to `get_scorer` to control whether or not to make a copy of a custom metric. In addition, internal scikit-learn metrics will be copied.
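A hypothetical usage of the proposed parameter, for illustration only (this signature was not merged into scikit-learn):

from sklearn.metrics import get_scorer, make_scorer, roc_auc_score

# Built-in scorers would always be returned as copies of the registry entry.
builtin = get_scorer("roc_auc")

# For a custom scorer, the hypothetical `copy` flag would decide whether the
# same object or an independent copy is returned.
custom = make_scorer(roc_auc_score, needs_threshold=True)
same = get_scorer(custom, copy=False)
fresh = get_scorer(custom, copy=True)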