StandardScaler fit overflows on float16 #13007
Comments
If adding dtype on the mean calculation is sufficient, that's probably a
good idea. Pull request?
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 18, 2019
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 18, 2019
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 19, 2019
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 19, 2019
    …tils.extmath. Also fixed some line lengths to fit the 80 limit (scikit-learn#13007)
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 20, 2019
baluyotraf added a commit to baluyotraf/scikit-learn that referenced this issue on Jan 21, 2019
    …ult with respect to their precisions (scikit-learn#13007)
Description
When using StandardScaler on a large float16 numpy array, the mean and std calculations overflow. I can convert the array to a larger precision, but when working with a large dataset the memory saved by using float16 for small values matters. The error is mostly in numpy. Adding the dtype to the mean/std calculation fixes it, but I'm not sure if that's how people here would like to do it.
Steps/Code to Reproduce
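The original reproduction snippet is not preserved in this copy of the issue; the following is a hedged sketch of a reproduction (array size and values are illustrative, chosen so a float16 running sum passes the float16 maximum of roughly 65504 during fit):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Enough rows that summing the column in float16 overflows.
X = np.full((100_000, 1), 1.0, dtype=np.float16)

scaler = StandardScaler()
scaler.fit(X)

# On the affected versions, the fitted statistics overflow
# (e.g. scaler.mean_ becomes inf and scaler.scale_ becomes nan).
print(scaler.mean_, scaler.scale_)
```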
Expected Results
The normalized array
Actual Results
Versions