Skip to content

[MRG+1] BUG: reset internal state of scaler before fitting #5416

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 16, 2015

Conversation

giorgiop
Copy link
Contributor

Fixes #5408.

@giorgiop
Copy link
Contributor Author

The example, which is not run by CI, works for me now.

@lesteve
Copy link
Member

lesteve commented Oct 16, 2015

The example, which is not run by CI, works for me now.

I just double-checked and this has fixed examples/svm/plot_rbf_parameters.py indeed.

print("Cannot fit %s a second time with different shape "
"of input. Error message: %s"
% (scaler.__class__.__name__, str(err)))
assert False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you could leave it as a smoke test.

for scaler in scalers:
    scaler.fit_transform(X)
    # with a different shape, this may break the scaler unless the internal
    # state is reset
    scaler.fit_transform(X_2d)

I am not sure what your try/except gives you. In particular, stdout is hidden by default in nosetests and even if it was not hidden it would very likely be not very obvious to spot because of the exception stack created by assert False.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK make sense. I will just leave the ValueError to break the test.

del self.n_samples_seen_
del self.data_min_
del self.data_max_
del self.data_range_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth trying to write slightly more generic code, something like:

attributes = [a for a in dir(self) if a.endswith('_')]

for attr in attributes:
    delattr(self, attr)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could go into BaseEstimator even. but not now ;)

@amueller
Copy link
Member

This is not the style we ususally use, but I guess it is ok. This is arguably a cleaner way than overwriting in partial_fit.

@amueller amueller added this to the 0.17 milestone Oct 16, 2015
@giorgiop
Copy link
Contributor Author

Any better idea? We may end up doing something similar into other estimators sooner or later.

@ogrisel
Copy link
Member

ogrisel commented Oct 16, 2015

I am fine with the explicit _reset as well. We should investigate why test_common did not catch this and fix it.

@amueller
Copy link
Member

I think it's good. Well, the common tests would be updated here: #3907 I think that includes the right test, but I'd have to double check.

@giorgiop giorgiop changed the title BUG: reset internal state of scaler before fitting [MRG] BUG: reset internal state of scaler before fitting Oct 16, 2015
@ogrisel
Copy link
Member

ogrisel commented Oct 16, 2015

Indeed. +1 for merge on my side then.

@ogrisel ogrisel changed the title [MRG] BUG: reset internal state of scaler before fitting [MRG+1] BUG: reset internal state of scaler before fitting Oct 16, 2015
@raghavrv
Copy link
Member

We may end up doing something similar into other estimators sooner or later.

From my experiments at #3907 I think most estimators do the reset on fit ! There still might be one or two but not more I think... :)

@amueller
Copy link
Member

yeah but they don't do it in a consistent and nice way.

@raghavrv
Copy link
Member

Ah okay :)

@amueller
Copy link
Member

Merging and retouching the docs.

amueller added a commit that referenced this pull request Oct 16, 2015
[MRG+1] BUG: reset internal state of scaler before fitting
@amueller amueller merged commit 4e64915 into scikit-learn:master Oct 16, 2015
@giorgiop giorgiop deleted the fix-scaler-refit branch November 3, 2015 12:29
@giorgiop giorgiop restored the fix-scaler-refit branch February 21, 2016 22:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants