-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
Raise errors on float->int conversions with NaN, inf #14412
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Yeah, I have been wondering about that myself. But on the other hand, blowing up casting options is a bit annoying. For integers specifically, it could be part of a error flags or so (i.e. if we had integer overflow errors, that could clearly handle this part). |
I agree about raising an error. Here is a simpler example to follow: import numpy as np
arr2D = [
[9,9,9],
[9,np.nan,9],
[9,9,9]
]
arr2D = np.array(arr2D)
arr2D = arr2D.astype('int32')
>>> arr2D
array([[ 9, 9, 9],
[ 9, -2147483648, 9],
[ 9, 9, 9]], dtype=int32)
I think this is C's int minimum? |
@aiqc that returns undefined behaviour (as defind by the C standard). It is a bit scary/annoying, but right now you must not rely on the minimum value to be returned. Typical values are either xref gh-19248. If we would not silence floating point warnings would at least warn you about this happening:
|
@seberg if it can’t be reliably parsed then that’s even more reason to raise error on dtype conversion? |
Yes, I don't mind raising an error, but I think a warning would be a good improvement. There are two things we may need to consider:
|
We are now giving a warning in this cases after gh-21437. Lets keep this one issue open about possibly raising an error. Or even sanitizing the output. |
|
Pinging this, as we discovered some failures in our code due to:
returning an architecture-dependent value ( The warning message added earlier is a start, but it can be hard to track down just where in the code that warning is coming from. If one could make float->int casts always, that would at least make it very obvious what needed to be fixed in the user code. |
That example fails in new NumPy versions @parejkoj. Although, not if you use e.g. |
Oh, I'm glad that that's now a failure! We're currently on numpy version 1.26.4, and working on compatibility checks for 2.0. Is there a PR with this particular fix I could look at? I'm not sure what you mean by "not if you use |
If you pass a NumPy Nan, rather than the Python float nan it will cast normally, and normal casts are still allowed/unchanged. |
With
ndarray.astype(..., casting="unsafe")
float arrays with NaN/inf will be converted tonp.iinfo(dtype).min/max
,(there are also some inconsistencies there cf #6109). This is very bad in practical applications as output data will be wrong by orders of magnitude. Other
casting
options simply disallow casting float to int.At the same time, casting from
dtype=np.object
works as expected,It could be useful to have some additional
casting
option allowing to convert int to float, but would error on NaN or inf. This could be done manually,but having it in numpy would be useful.
I also wonder if there is really a case when not raising on such conversions is meaningful (even with
casting='unsafe'
). For instance pandas does this conversions as expected (using numpy dtypes),Numpy/Python version information:
The text was updated successfully, but these errors were encountered: