-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG+1] BUG: reset internal state of scaler before fitting #5416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fda4e58
to
6eb1ddc
Compare
The example, which is not run by CI, works for me now. |
5f7a43e
to
5ab76c9
Compare
I just double-checked and this has fixed examples/svm/plot_rbf_parameters.py indeed. |
print("Cannot fit %s a second time with different shape " | ||
"of input. Error message: %s" | ||
% (scaler.__class__.__name__, str(err))) | ||
assert False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you could leave it as a smoke test.
for scaler in scalers:
scaler.fit_transform(X)
# with a different shape, this may break the scaler unless the internal
# state is reset
scaler.fit_transform(X_2d)
I am not sure what your try/except gives you. In particular, stdout is hidden by default in nosetests and even if it was not hidden it would very likely be not very obvious to spot because of the exception stack created by assert False
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK make sense. I will just leave the ValueError
to break the test.
5ab76c9
to
49a5dbd
Compare
del self.n_samples_seen_ | ||
del self.data_min_ | ||
del self.data_max_ | ||
del self.data_range_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth trying to write slightly more generic code, something like:
attributes = [a for a in dir(self) if a.endswith('_')]
for attr in attributes:
delattr(self, attr)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could go into BaseEstimator
even. but not now ;)
This is not the style we ususally use, but I guess it is ok. This is arguably a cleaner way than overwriting in |
Any better idea? We may end up doing something similar into other estimators sooner or later. |
I am fine with the explicit |
I think it's good. Well, the common tests would be updated here: #3907 I think that includes the right test, but I'd have to double check. |
Indeed. +1 for merge on my side then. |
49a5dbd
to
c7b1a6e
Compare
From my experiments at #3907 I think most estimators do the reset on |
yeah but they don't do it in a consistent and nice way. |
Ah okay :) |
Merging and retouching the docs. |
[MRG+1] BUG: reset internal state of scaler before fitting
Fixes #5408.