Unexpected truncation when storing float to int array #8733


Closed
OliverEvans96 opened this issue Mar 2, 2017 · 24 comments

@OliverEvans96

I was caught off guard by the following behavior using numpy 1.12:

Current behavior

In  [1]:  a=array([1,0])
          a[0] = 1 + 1.1  
          print(a.dtype)       
          print(a)
          print(a.dtype)               
Out [1]:  int64
          [ 2  0 ]
          int64 

Expected behavior

In  [2]:  a=array([1,0])
          a[0] = 1 + 1.1  
          print(a.dtype)       
          print(a)
          print(a.dtype)               
Out [2]:  int64
          [ 2.1  0. ]
          float64

It may be that this is intentional behavior, but I would like to suggest automatic conversion of integer arrays to floating point arrays when a float is saved to an element or slice of the array. I'm not sure how straightforward implementation of this feature would be.

At the very least, there ought to be a warning when truncation occurs, especially since at no point in the above code do I explicitly declare the dtype of a to be int64, so I (and I think most users) would expect it to default to float64.

I'm interested in opinions on the matter.

Thanks,
Oliver

@eric-wieser
Member

Automatic conversion can be a very expensive operation if the array is large, so I'm definitely -1 on that. A warning seems pretty reasonable though, and it wouldn't surprise me if a mechanism already exists to do that.

@OliverEvans96
Author

OliverEvans96 commented Mar 2, 2017 via email

@eric-wieser
Member

eric-wieser commented Mar 2, 2017

That choice is mostly irrelevant - the important thing is that it's best to default to the backwards-compatible behaviour.

@OliverEvans96
Author

OliverEvans96 commented Mar 2, 2017 via email

@eric-wieser
Member

eric-wieser commented Mar 2, 2017

Ok, so here's where it's already implemented:

>>> a=array([1,0])
>>> np.copyto(a[0,...], 1 + 1.1)
TypeError: Cannot cast scalar from dtype('float64') to dtype('int32') according to the rule 'same_kind'

a[0,...] is a trick to return a view of a single element, rather than create a scalar
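For readers following along, the `casting` keyword of `np.copyto` is what selects the rule; a minimal sketch of both the rejection and the explicit opt-in (the exact dtype names in the error message vary by platform):

```python
import numpy as np

a = np.array([1, 0])  # dtype is inferred as a platform integer

# The default rule 'same_kind' rejects the float -> int assignment:
try:
    np.copyto(a[0, ...], 1 + 1.1)  # a[0, ...] is a 0-d view, not a scalar
except TypeError as exc:
    print(exc)

# Opting in to the truncation explicitly is still possible:
np.copyto(a[0, ...], 1 + 1.1, casting='unsafe')
print(a)  # first element was truncated to 2
```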

@OliverEvans96
Author

OliverEvans96 commented Mar 3, 2017 via email

@eric-wieser
Member

array_assign_subscript is most likely the place to look.

@eric-wieser
Member

Digging deeper into copyto, it looks like you'd want to add

if (!PyArray_CanCastTypeTo(PyArray_DESCR(src), PyArray_DESCR(dst), casting))

somewhere.

I think if you did implement this, it should take the form of np.setconfig(casting='same-kind') or some global setting like floating point warnings.
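The "floating point warnings" mechanism referred to here is `np.seterr`/`np.errstate`; the suggested `np.setconfig(casting=...)` does not exist, but an analogous switch might behave like this existing one:

```python
import numpy as np

# np.errstate scopes floating-point error handling to a block; a global
# casting setting would presumably be toggled the same way.
with np.errstate(invalid='raise'):
    try:
        np.sqrt(np.array([-1.0]))  # invalid operation now raises
    except FloatingPointError as exc:
        print(exc)
```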

@charris
Member

charris commented Mar 3, 2017

We should definitely raise an error for this. ISTR that this issue may have been reported before.

@pv
Member

pv commented Mar 3, 2017 via email

@eric-wieser
Member

eric-wieser commented Mar 3, 2017

Hmm, what's the end goal here?

If I have this code

a = np.zeros((100, 100), dtype=np.int8)
b = np.zeros((100,), dtype=np.float64)

a[0] = b

We want it to throw a warning, right? What should that warning encourage if this behaviour is deliberate? a[0] = b.astype(np.int8) incurs an extra copy, and np.copyto(a[0], b, casting=...) doesn't work for advanced indexing.
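The advanced-indexing limitation comes from the fact that fancy indexing returns a copy rather than a view, so `np.copyto` writes into a throwaway temporary; a small sketch:

```python
import numpy as np

a = np.zeros((100, 100), dtype=np.int8)
b = np.zeros((100,), dtype=np.float64)

# a[0] is a view (basic indexing), so copyto can write through it:
np.copyto(a[0], b, casting='unsafe')

# a[idx, 0] is a copy (advanced indexing), so copyto modifies only a
# temporary array and the original is left untouched:
idx = np.array([0, 2, 4])
np.copyto(a[idx, 0], np.array([1.5, 2.5, 3.5]), casting='unsafe')
print(a[idx, 0])  # still all zeros
```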

@charris
Member

charris commented Mar 3, 2017

The idea is that most such occurrences will be programming errors. That was certainly the case for complex numbers. Being explicit also has the advantage that the conversion, truncation or rounding, will be clear. The code will also be easier to understand with an explicit cast.

@eric-wieser
Member

Right, but my point is that we have no way of specifying that cast without also doing a copy, do we?

@charris
Member

charris commented Mar 3, 2017

I think you can use functions for that, copyto for instance. The default conversion of floats is truncation. For inplace addition you can use the np.add with the out parameter, etc.
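As a concrete illustration of the `out` parameter approach (a sketch; the explicit `casting='unsafe'` is what makes the truncation deliberate):

```python
import numpy as np

a = np.arange(10)      # integer array
b = np.full(10, 0.5)

# a += b raises under the 'same_kind' rule; routing the operation
# through np.add with out= and an explicit casting mode spells out
# that the fractional parts will be discarded:
np.add(a, b, out=a, casting='unsafe')
print(a)
```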

@seberg
Member

seberg commented Mar 3, 2017

For in-place ufuncs, we already have the warning. Assignment is the only thing that is still "unsafe". (Though possibly it's just a deprecation warning; I don't remember the current state.)

@eric-wieser
Member

out and copyto only work for views, not advanced indexing

@OliverEvans96
Author

OliverEvans96 commented Mar 3, 2017 via email

@eric-wieser
Member

Also, I guess I'm making assumptions here about what __setitem__ does under the hood. It might be that casting incurs a copy even implicitly.

@seberg
Member

seberg commented Mar 3, 2017

Depends a lot on the code paths taken; the most involved ones end up doing much the same as copyto (or even call it for the view cases). Plus, object arrays do get an extra copy in some cases. But no, normally there is no copy made. You could add indexing support to copyto in principle. I am not quite sure I am convinced that this is the only right thing. OTOH, it is one of those traps when learning numpy.

@njsmith
Member

njsmith commented Mar 3, 2017

Ugh, I thought we had deprecated this. But @seberg's right, we seem to have gone all the way through the deprecation cycle for in-place operations, but missed simple assignment:

In [4]: a = np.arange(10)

In [5]: a += 0.5
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-633f1f7b6cb9> in <module>()
----> 1 a += 0.5

TypeError: Cannot cast ufunc add output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

In [6]: a[:] = 0.5

In [7]: 

I agree with @charris et al that we should deprecate these unsafe conversions on assignments and eventually make them an error.
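The rule being proposed for assignment is the same `same_kind` rule that in-place ufuncs already enforce; it can be queried directly with `np.can_cast`:

```python
import numpy as np

# float64 -> int64 is not a 'same_kind' cast, but int64 -> float64 is:
print(np.can_cast(np.float64, np.int64, casting='same_kind'))  # False
print(np.can_cast(np.int64, np.float64, casting='same_kind'))  # True
```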

@OliverEvans96
Author

I'm not sure whether this directly addresses the issue or if it's reasonable, but it just occurred to me that there would be no ambiguity in my original situation if a=array([1,0]) defaulted to a float array. If that were the case, then somebody who intended to truncate the value they're saving in an int array would simply explicitly declare a=array([1,0],dtype=int64).

Is there a reason this is not presently the case? Would it break backwards-compatibility or be significantly less efficient for most use cases?

@charris
Member

charris commented Mar 9, 2017

if a=array([1,0]) defaulted to a float array. If that were the case,

Terrible compatibility break. The default is to use the minimum suitable kind compatible with the given elements.
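The "minimum suitable kind" rule can be seen directly from dtype inference (exact sizes are platform-dependent, so only the kind is shown):

```python
import numpy as np

# The inferred kind escalates only as far as the elements require:
print(np.array([1, 0]).dtype.kind)         # 'i' (integer)
print(np.array([1, 0.5]).dtype.kind)       # 'f' (float)
print(np.array([1, 0.5 + 1j]).dtype.kind)  # 'c' (complex)
```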

@stevengj

stevengj commented Mar 24, 2022

FWIW, Julia has always thrown an error when you try to implicitly assign a non-integer value to an integer-typed location:

julia> a = [1,0]
2-element Vector{Int64}:
 1
 0

julia> a[1] = 1.1
ERROR: InexactError: Int64(1.1)

and I haven't heard anyone say that they would prefer silent truncation. (a[1] = 2.0 succeeds because the floating-point value 2.0 has an exact conversion to an integer type.)

Deprecating (emitting a warning) then throwing an error seems like a good route for numpy too.

@seberg
Member

seberg commented Jan 18, 2024

Closing this; part of the discussion here predates gh-25621 by a long time, but even back then it seems like this was more of a question anyway.

@seberg closed this as not planned on Jan 18, 2024
7 participants