Partial_fit for Preprocessing StandardScaler #5028
Comments
I plan on contributing this in the next few weeks, within my internship period at INRIA working on sklearn.
Cool :)
I agree. I don't see how this could best be done for
For RobustScaler, this could probably be done approximately by binning. We don't really need to do that now, though.
I have opened #5104
This can be closed?
Definitely.
Hi, I am trying to find a way to do partial_fit on RobustScaler. Reading through this discussion and the merged code, it looks like RobustScaler doesn't have partial_fit. I need to use RobustScaler because of outliers, but I am running into memory issues. Do you have any suggestions for doing partial_fit on RobustScaler? I'd appreciate any help.
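One possible workaround, following the binning idea mentioned earlier in the thread, is to accumulate a fixed-width histogram per feature across batches and read an approximate median and interquartile range from it. The sketch below is not part of scikit-learn: `BinnedQuantiles`, its `partial_fit` and `quantile` methods, and the `lo`, `hi` and `n_bins` parameters are all hypothetical names. It handles a single feature and assumes you can guess a plausible range for the bulk of the data in advance; values outside that range are counted in the edge bins, so the quantiles are only accurate to about one bin width.

```python
class BinnedQuantiles:
    """Approximate quantiles of one feature from a fixed-width histogram.

    Hypothetical helper, not a scikit-learn API. Assumes the bulk of the
    data lies in [lo, hi]; anything outside is clipped into the edge bins.
    """

    def __init__(self, lo, hi, n_bins=1000):
        self.lo = lo
        self.hi = hi
        self.n_bins = n_bins
        self.width = (hi - lo) / n_bins
        self.counts = [0] * n_bins
        self.n = 0

    def partial_fit(self, values):
        """Add one batch of values to the histogram."""
        for x in values:
            k = int((x - self.lo) / self.width)
            k = min(max(k, 0), self.n_bins - 1)  # clip outliers into edge bins
            self.counts[k] += 1
            self.n += 1
        return self

    def quantile(self, q):
        """Return the approximate q-quantile (0 <= q <= 1)."""
        target = q * self.n
        seen = 0
        for k, c in enumerate(self.counts):
            if c and seen + c >= target:
                # Interpolate linearly inside the bin that crosses the target.
                frac = (target - seen) / c
                return self.lo + (k + frac) * self.width
            seen += c
        return self.hi


# Usage: stream batches, then derive the centre and scale.
bq = BinnedQuantiles(lo=0.0, hi=1000.0, n_bins=100)
for start in range(0, 1000, 250):
    bq.partial_fit(range(start, start + 250))
bq.partial_fit([1e6])  # a single extreme outlier

center = bq.quantile(0.5)
scale = bq.quantile(0.75) - bq.quantile(0.25)
```

Each value can then be transformed as `(x - center) / scale`, which mirrors the median and interquartile-range scaling that RobustScaler applies by default. The price is a second pass over the data for the transform, plus an approximation error bounded by the bin width.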
A cool feature for StandardScaler (and other preprocessors) would be a partial_fit option. This would work well with existing out-of-core learning tools (e.g. SGDClassifier's partial_fit). If I can't load the data into memory, I can't use StandardScaler, and I end up calculating the mean and std in batches. That works fine, but it isn't very elegant.
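For reference, the manual workaround described above (calculating the mean and std in batches) can be sketched in plain Python as follows. This is only an illustration, not the scikit-learn implementation: `RunningStats` and its methods are hypothetical names. It merges each batch into the running totals with the standard pairwise mean/variance update and uses the population standard deviation.

```python
import math


class RunningStats:
    """Per-feature mean and variance accumulated one batch at a time.

    Hypothetical helper for illustration, not a scikit-learn API.
    """

    def __init__(self, n_features):
        self.n = 0
        self.mean = [0.0] * n_features
        self.m2 = [0.0] * n_features  # sum of squared deviations from the mean

    def partial_fit(self, batch):
        """Merge one batch (a list of rows) into the running statistics."""
        m = len(batch)
        if m == 0:
            return self
        total = self.n + m
        for j in range(len(self.mean)):
            col = [row[j] for row in batch]
            b_mean = sum(col) / m
            b_m2 = sum((x - b_mean) ** 2 for x in col)
            delta = b_mean - self.mean[j]
            # Combine the batch statistics with the running statistics.
            self.m2[j] += b_m2 + delta * delta * self.n * m / total
            self.mean[j] += delta * m / total
        self.n = total
        return self

    def std(self):
        """Population standard deviation (divides by n, not n - 1)."""
        return [math.sqrt(v / self.n) for v in self.m2]

    def transform(self, batch):
        """Standardize a batch; constant features are only centred."""
        scale = [s or 1.0 for s in self.std()]
        return [
            [(x - mu) / s for x, mu, s in zip(row, self.mean, scale)]
            for row in batch
        ]


# Usage: the batches would normally be read from disk one at a time.
batches = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0], [5.0, 50.0]],
]
stats = RunningStats(n_features=2)
for batch in batches:
    stats.partial_fit(batch)
scaled = stats.transform(batches[0])
```

The same loop structure is what a `partial_fit` method on the scaler would hide: one pass to accumulate the statistics, then each batch is transformed before being handed to an incremental estimator.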