Skip to content

Raise a warning when assigning to a temporary (fix described, patch needed) #2802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
erg opened this issue Dec 10, 2012 · 10 comments
Open

Comments

@erg
Copy link

erg commented Dec 10, 2012

It would be nice if numpy raised a warning when people assigned into temporary arrays, since it's rather easy to end up with these accidentally and not realize what's going on, so people keep barking their shins on this. The original report below is one example; another would be

a.ravel()[:] = [...]

which will usually modify a... but sometimes, depending on a's memory layout, it may be a no-op instead. Quite sneaky.

The way to do this would be to write a function like PyArray_DefinitelyTemporary(arr) that returned true iff all entries in arr's .base chain fulfilled the properties:

  • is an ndarray
  • with reference count exactly 1

Then in the Python entry points for numpy assignments -- np.copyto, __setitem__, ufunc out= parameters, etc. -- have some code like

if (PyArray_DefinitelyTemporary(in_arr)) {
  warn("assignment to temporary value has no effect!");
}

(NB: I say Python entry points above, because any numpy C API entry points can legitimately take borrowed references to non-temporary arrays that will have reference count 1. So we need to be careful that these checks are done before we hit code that could be reached by C API users.)


Original report:

I know that I should be using slice notation instead of np.copyto, but I also feel like case #1 should throw an error.

The case that works:

import numpy as np
emp = np.empty(3)
# emp is array([  1.72723371e-077,   1.72723371e-077,   2.19630579e-314])
np.copyto(emp, np.arange(3))
# emp is array([ 0.,  1.,  2.])  # great, just like I expect


Case #1:

import numpy as np
emp = np.empty(3)
# emp is array([  1.72723371e-077,   1.72723371e-077,   2.19630579e-314])
np.copyto(emp+1, np.arange(3))  # want an error here, acts like no-op or worse
# emp is array([  1.72723371e-077,   1.72723371e-077,   2.19630579e-314])
@njsmith
Copy link
Member

njsmith commented Dec 10, 2012

Writing something like arr+1 never has anything to do with offsets in
numpy. + always means addition, and what's happening here is that emp+1
is adding 1 to each of the values in the emp array, and storing the
results in a temporary array. Then this temporary array gets overwritten by
the call to copyto, and then its reference count drops to zero and it
gets discarded.

Compare to

new_arr = emp + 1
np.copyto(new_arr, np.arange(3))

This code is exactly the same as far as numpy can tell - the same
operations are performed on the same arrays - but it makes perfect sense
now.

The only thing I can think of is, we could teach np.copyto to issue a
warning "assignment to temporary value" if its argument has a reference
count of 1, meaning it will be deallocated as soon as copyto returns? I
suppose that would be useful. Such assignments aren't necessarily harmful,
but they're never useful.

It looks like we could do the same for regular slice assignment too. I was
worried about the temporaries that python sometimes generates for in-place
calls triggering the warning accidentally, but it looks like the python
interpreter holds references to those, just like if they were real
variables. (As it has to I guess, to avoid the generic func call code
destroying the temporaries before returning.)
On 10 Dec 2012 00:05, "erg" notifications@github.com wrote:

I know that I should be using slice notation instead of np.copyto, but I
also feel like case #1 #1 should
throw an error.

The case that works:

import numpy as np
emp = np.empty(3)

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])

np.copyto(emp, np.arange(3))

emp is array([ 0., 1., 2.]) # great, just like I expect

Case #1:

import numpy as np
emp = np.empty(3)

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])

np.copyto(emp+1, np.arange(3)) # want an error here, acts like no-op or worse

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])


Reply to this email directly or view it on GitHubhttps://github.com//issues/2802.

@njsmith
Copy link
Member

njsmith commented Dec 10, 2012

Of course, views complicate this picture... At a minimum we should check
that the array owns its data before issuing any warnings. A more clever
implementation would check the reference count on .base as well, and issue
a warning for arrays which didn't own their data but for which all the
reference counts on the .base chain were also 1.
On 10 Dec 2012 00:57, njs@pobox.com wrote:

Writing something like arr+1 never has anything to do with offsets in
numpy. + always means addition, and what's happening here is that emp+1
is adding 1 to each of the values in the emp array, and storing the
results in a temporary array. Then this temporary array gets overwritten by
the call to copyto, and then its reference count drops to zero and it
gets discarded.

Compare to

new_arr = emp + 1
np.copyto(new_arr, np.arange(3))

This code is exactly the same as far as numpy can tell - the same
operations are performed on the same arrays - but it makes perfect sense
now.

The only thing I can think of is, we could teach np.copyto to issue a
warning "assignment to temporary value" if its argument has a reference
count of 1, meaning it will be deallocated as soon as copyto returns? I
suppose that would be useful. Such assignments aren't necessarily harmful,
but they're never useful.

It looks like we could do the same for regular slice assignment too. I was
worried about the temporaries that python sometimes generates for in-place
calls triggering the warning accidentally, but it looks like the python
interpreter holds references to those, just like if they were real
variables. (As it has to I guess, to avoid the generic func call code
destroying the temporaries before returning.)
On 10 Dec 2012 00:05, "erg" notifications@github.com wrote:

I know that I should be using slice notation instead of np.copyto, but I
also feel like case #1 #1 should
throw an error.

The case that works:

import numpy as np
emp = np.empty(3)

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])

np.copyto(emp, np.arange(3))

emp is array([ 0., 1., 2.]) # great, just like I expect

Case #1:

import numpy as np
emp = np.empty(3)

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])

np.copyto(emp+1, np.arange(3)) # want an error here, acts like no-op or worse

emp is array([ 1.72723371e-077, 1.72723371e-077, 2.19630579e-314])


Reply to this email directly or view it on GitHubhttps://github.com//issues/2802.

@erg
Copy link
Author

erg commented Dec 10, 2012

I forgot about slicing notation and was trying to find an api call to make it work and got confused. Maybe the docs for np.copyto should mention slice notation as an alternative or maybe I just need to step up my python game... Making this syntax throw an error does seem like a hard problem and probably isn't worth the effort. I can close this issue if you want.

@njsmith
Copy link
Member

njsmith commented Dec 10, 2012

Well, I think there is a real issue here, so I edited the bug description to describe a more general solution; perhaps someone will see it and be inspired to fix it :-). (The new description also has another plausible example of how this same basic issue can bite people in nasty ways, even if they aren't otherwise confused. And anyway I like software that I can successfully use even when I'm confused, since everyone gets confused sometimes...)

@seberg
Copy link
Member

seberg commented Dec 14, 2012

Just to note, the functions I can think of right now that could use this warning are:

  • array assignments as in njsmith's example
  • copyto and its friends: putmask, put and place
  • array.sort() method
  • np.random.shuffle

Since most functions in numpy (ie. ufuncs) return the output array, no warning is possible there.

@njsmith
Copy link
Member

njsmith commented Dec 14, 2012

Doh, of course, that's a good point about out= arguments. And it'd be nice
if we could avoid adding this quirky refcount-checking function to the
public multiarray API, so if the ufunc library doesn't need it then that's
nice...

Still trying to work out in my mind how this actually works. Like you say,
it's possible to pass a temporary as an out= argument and have it mean
something, but it still seems rather odd, yes? You need something like:

c = np.add(a, b, out=np.empty(10))

I'm not sure why someone would write this; is there any situation where
it's useful? it's exactly equivalent to

c = np.add(a, b, dtype=np.float64)

so arguably a warning is still a good idea. But it might still be a useful
construct for auto-generated code or something like that for some reason
I'm not quite seeing right now?

On Fri, Dec 14, 2012 at 1:50 PM, seberg notifications@github.com wrote:

Just to note, the functions I can think of right now that could use this
warning are:

  • array assignments as in njsmith's example
  • copyto and its friends: putmask, put and place
  • array.sort() method
  • np.random.shuffle

Since most functions in numpy (ie. ufuncs) return the output array, no
warning is possible there.


Reply to this email directly or view it on GitHubhttps://github.com//issues/2802#issuecomment-11376744.

@seberg
Copy link
Member

seberg commented Dec 14, 2012

@njsmith didn't think enough about it maybe. It seems to me you are right, unless you want some very quirky memory order of the result (that cannot be specified with just order), I cannot see any reason to use that design, too. So it is probably so rare that it is actually wanted behaviour that numpy could warn anyway. Maybe the only way to find out is to see if people complain when you add warning to a development version :).

@mattip
Copy link
Member

mattip commented Feb 25, 2019

Can we close this? It seems there are edge cases to the edge cases in the logic necessary to detect assigning to temporary variables

@rgommers
Copy link
Member

rgommers commented May 4, 2020

No further response, closing. Feel free to reopen, but seems like this is not a useful issue anymore.

@rgommers rgommers closed this as completed May 4, 2020
@eric-wieser
Copy link
Member

eric-wieser commented May 4, 2020

This still strikes me as a handy feature to implement - and since we were able to implement elision of temporaries using refcounts, it certainly seems possible that we could pull off the same trick here to emit warnings.

I'd argue we should reopen this, but it was certainly not a good first issue.

@rgommers rgommers reopened this May 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants