[MRG] Clone estimator between fits in sklearn/utils/estimator_checks.py::check_supervised_y_2d #10978
Conversation
I don't think this is necessary as we clone explicitly in every test. This helper function is only used in some tests, so I don't see why we should do the cloning in this one.
During the test, the relevant order of events is that the estimator is first fit (and used to predict) with a 1-d y, and then the same instance is fit again with a 2-d y, after which the two sets of predictions are compared.
The problem is that, if the estimator is not cloned between the two fits, the test will fail for estimators that store and reuse a random state across fits [1]. It is not unreasonable for an estimator to check whether an internal attribute has already been set before re-initializing it, and this is important for estimators that might implement `partial_fit`. All in all, adding this single line of code should not affect the result of what this test is supposed to be checking for, and does not have enough side effects to justify not including it.

[1] https://github.com/scikit-learn/scikit-learn/blob/96a02f3/sklearn/utils/testing.py#L667
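For illustration only (this is not the actual test code; `Ridge` is just a stand-in estimator here), the shape of the proposed change is a single `clone` between the two fits:

```python
import numpy as np
from sklearn.base import clone
from sklearn.linear_model import Ridge

X = np.random.RandomState(0).rand(20, 3)
y = np.random.RandomState(1).rand(20)

est = Ridge()
est.fit(X, y)                   # first fit, 1-d y
pred_1d = est.predict(X)

est = clone(est)                # the proposed one-line change: refit a fresh copy
est.fit(X, y[:, np.newaxis])    # second fit, 2-d y
pred_2d = est.predict(X)

# the check then compares the two sets of predictions
np.testing.assert_allclose(pred_1d.ravel(), pred_2d.ravel())
```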
What test is that? It should probably be cloned unless it's testing that consecutive fits lead to identical results.
No, that's not true. We should expect that entirely. That is what the dev guide says.
Yes, it has side effects: it clones the estimator in a helper function that is unrelated to cloning an estimator.
The test is check_supervised_y_2d. My interpretation of the test is that it is mostly checking for dimensionality of inputs/outputs and data conversion warnings.
I think that it's not a reasonable expectation for estimators to completely re-initialize everything upon a call to `fit`.
I respectfully disagree.
Why? This is not the approach scikit-learn is taking. As you can see in the development doc, `fit` should completely reset an estimator.
Sorry, I misunderstood the proposal; this is actually fine. I thought you wanted to modify the helper function.
See http://scikit-learn.org/stable/developers/contributing.html#random-numbers
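For reference, a minimal sketch of the convention described there (the `NoisyMeanEstimator` class is made up for illustration): the `random_state` parameter is stored untouched in `__init__`, and an actual RNG is only derived inside `fit`, so no consumed random state is carried over between fits.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.utils import check_random_state

class NoisyMeanEstimator(BaseEstimator):
    """Toy estimator following the documented random_state convention."""

    def __init__(self, random_state=None):
        # store the parameter exactly as given; no RNG is created here
        self.random_state = random_state

    def fit(self, X, y):
        # a fresh RNG is derived from the parameter on every call to fit
        rng = check_random_state(self.random_state)
        self.offset_ = float(np.mean(y)) + rng.normal(scale=1e-6)
        return self

    def predict(self, X):
        return np.full(len(X), self.offset_)
```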
I've not looked through all the discussion, but I think the issue may stem from us not currently testing that multiple fits of an estimator produce the same model... Or do we?
Thanks for the pointers, @amueller. I understand the technical reasons behind the model of instantiation (even though it makes building custom estimators a bit painful), and I think that some of those conceptual models break when you introduce `partial_fit`. I simply think that having to check, on every fit, each possible parameter that can be sent to the constructor adds unnecessary friction.
@omtinez why does it make creating models painful? And why would it break if you introduce `partial_fit`? It's relatively simple if you adhere to the following rules: don't do anything in `__init__` except store the parameters unchanged, and do all validation and initialization in `fit`.

@jnothman I thought we did. If this is the only test that breaks, then maybe we don't? I can't see a test doing that.

@omtinez can you maybe provide your estimator's code? It's quite strange if it breaks only in this place; it probably violates the sklearn API in several ways.
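As a hypothetical illustration of the first rule (these toy classes are mine, not from the discussion): `clone` rebuilds an estimator from its constructor parameters, so a constructor that does anything besides storing them breaks cloning.

```python
from sklearn.base import BaseEstimator, clone

class GoodInit(BaseEstimator):
    def __init__(self, alpha=1.0):
        self.alpha = alpha               # just store the parameter

class BadInit(BaseEstimator):
    def __init__(self, alpha=1.0):
        self.alpha = float(alpha) * 2    # derives state at construction time

print(clone(GoodInit(alpha=0.5)).alpha)  # 0.5 -- round-trips cleanly
try:
    clone(BadInit(alpha=0.5))
except RuntimeError as exc:
    # clone notices that the constructor modified the parameter
    print(exc)
```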
@amueller the complaint about making things painful is, I think, pretty difficult to address due to many decisions stemming from the technical choices made around `__init__` and cloning. To illustrate the point I'm trying to make, the following kind of code would fail this test:
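A hypothetical sketch of the kind of pattern meant here (my own reconstruction, not the snippet originally posted): an estimator that keeps its random number generator across calls to `fit` behind a `hasattr` check, so a second fit draws from an already consumed RNG and yields different predictions.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.utils import check_random_state

class StatefulEstimator(BaseEstimator):
    """Hypothetical estimator that reuses state across calls to fit."""

    def __init__(self, random_state=None):
        self.random_state = random_state

    def fit(self, X, y):
        if not hasattr(self, "rng_"):
            # initialised only on the first fit; later fits reuse the
            # already consumed random state, so predictions drift
            self.rng_ = check_random_state(self.random_state)
        self.coef_ = self.rng_.normal(size=np.asarray(X).shape[1])
        return self

    def predict(self, X):
        return np.asarray(X) @ self.coef_
```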
Instead, if I understand the problem correctly, this is what actually needs to happen:
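Continuing the hypothetical sketch above (still my own reconstruction), the contrasting pattern re-derives the RNG and all fitted attributes on every call to `fit`, so repeated fits with a fixed `random_state` give identical results:

```python
class ResettingEstimator(StatefulEstimator):
    """Same toy estimator, but every fit starts from scratch."""

    def fit(self, X, y):
        # rebuild the RNG from the constructor parameter and discard old state
        self.rng_ = check_random_state(self.random_state)
        self.coef_ = self.rng_.normal(size=np.asarray(X).shape[1])
        return self
```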
All that said, I think that the biggest issue here is that the test failing has nothing to do with any of this. This test is supposed to be testing whether or not a specific warning is thrown. My fix is a single line of code that does not affect the result of what the test was actually designed to check, and it should not make any previously passing estimators fail it. I surely hope not every single line of code comes under this much scrutiny, otherwise timely progress is doomed :-)
For a more complete example of an estimator that should be able to pass this test (although it would fail quite a few others, which are ignored in order to make the code and idea a little easier to understand), please see the sample estimator in #10977.
Can you please explain why you need one call to fit to depend on whether or not fit had been called previously? I.e. why is `hasattr(self, ...)` an appropriate thing to do?

I acknowledge your critique of this review process, but while we appreciate that the patch would stop the exception, one of our key jobs in reviewing is to make sure the change in behaviour is well motivated. So far, we are not convinced, because we think check_estimator *should* fail such an implementation if the estimator is to be compatible with everything else in the library.
Mainly performance reasons. This makes more sense in the context of `partial_fit`. I also think there are reasonable expectations about when an estimator should return the same predictions, but even if you still disagree with all of the above, this behaviour should not be tested in this function. It is bad unit-testing practice, and it took me some time as an estimator developer to track down where the issue was coming from. (Edited: I accidentally used the wrong name originally.)
No, not really. Users are never really expected to use `clone`; the code example you gave should not need it.
There should never be a reason to do that. I'd take a PR that ensures this is tested somewhere else and then removes the test here.
The current API makes things tricky for two reasons:
I think that the ML community will find, more and more frequently, the need to work with estimators that are able to take batches of data, i.e. estimators that implement `partial_fit`. The tests provided by `check_estimator` make it harder to share an implementation between `fit` and `partial_fit`. Not making things easier for developers of estimators results in many libraries having estimators that only loosely follow the scikit-learn API.
I think there is little merit discussing 1) at this point. I don't understand 2). What do you mean by sharing an implementation of `fit` and `partial_fit`?
There are two separate issues here: the organization of the common tests, and the API conventions. The API conventions right now are very clear. You argue they are not good. We can talk about that. But they are very clear right now. And the common tests enforce these conventions. Right now this test tests two things. I agree, that's not great.
And with respect to "friction", what we are talking about here is having a method called `_fit`:

```python
def _fit(self, X, y):
    # actual fitting code
    pass

def _initialize_estimator(self):
    # initialize attributes
    pass

def fit(self, X, y):
    self._initialize_estimator()
    self._fit(X, y)
    return self

def partial_fit(self, X, y, classes):
    # only (re-)initialize on the first call to partial_fit
    if self._partial_fit_first_call():
        self._initialize_estimator()
    self._fit(X, y)
    return self
```

If this is a very common pattern for you, create a baseclass with that code, done.
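For concreteness, a hypothetical sketch of what such a base class and a concrete estimator could look like (all names here are made up for illustration, not taken from scikit-learn):

```python
import numpy as np
from sklearn.base import BaseEstimator

class InitResetBase:
    """Hypothetical base class collecting the fit/partial_fit boilerplate above."""

    def fit(self, X, y):
        self._initialize_estimator()
        self._is_initialized = True
        self._fit(X, y)
        return self

    def partial_fit(self, X, y, classes=None):
        if self._partial_fit_first_call():
            self._initialize_estimator()
            self._is_initialized = True
        self._fit(X, y)
        return self

    def _partial_fit_first_call(self):
        # "first call" simply means nothing has been initialised on this instance
        return not getattr(self, "_is_initialized", False)

class RunningMeanRegressor(InitResetBase, BaseEstimator):
    """Toy estimator: predicts the running mean of the targets seen so far."""

    def _initialize_estimator(self):
        self.mean_ = 0.0
        self.n_seen_ = 0

    def _fit(self, X, y):
        y = np.asarray(y, dtype=float)
        total = self.mean_ * self.n_seen_ + y.sum()
        self.n_seen_ += y.size
        self.mean_ = total / self.n_seen_

    def predict(self, X):
        return np.full(len(X), self.mean_)
```

A concrete estimator then only implements `_initialize_estimator` and `_fit`, and gets consistent `fit`/`partial_fit` behaviour from the shared base class.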
I think this can be closed as we merged #12328. If you disagree, feel free to comment and reopen.
Reference Issues/PRs
Fixes #10977
What does this implement/fix? Explain your changes.
Test function `check_supervised_y_2d` reuses an already fitted estimator, which could leave potential side effects. This PR clones the estimator between fits to prevent any potential side effects from affecting the results of this test.

Any other comments?
Please see #10977 for a sample estimator that would fail this test, even though it correctly handles what `check_supervised_y_2d` should be testing for.