-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BUG: it.close() disallows access to iterator, fixes #10950 #10951
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -2847,7 +2847,7 @@ def test_writebacks(): | |||
enter = it.__enter__ | |||
assert_raises(ValueError, enter) | |||
|
|||
def test_close(): | |||
def test_close_equivalent(): | |||
''' using a context amanger and using nditer.close are equivalent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
amanger -> manager
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
assert_equal (next(it), 0) | ||
it.close() | ||
assert_raises(StopIteration, next, it) | ||
assert_raises(ValueError, getattr, it, 'operands') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we maybe allow access to the operands even after .close()
?
Maybe I'm forgetting how this all works, but I think the operands are still safe to use after closing. It is the temp buffers which are unsafe.
In fact, isn't it the case that the output operand should only be used by other code after close is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to make explicit that it.operands
can no longer be trusted, it may retrieve stale data from a closed iterator. I will add documentation to that effect, but am open to other opinons
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with @ahaldane on this one. Doesn't operands
point to the array to write back to, in which case it's not even valid until after the it.close()
is called?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to document the behavour wrt writeback semantics here
If writeback semantics were used and have been resolved, the operand's data is no longer related to the input array's data. If writeback semantics were not used, the operand shares data with the input array. So we could create a situation where code is fragile: it works without writeback semantics but fails when they are used.
Note this was true even before PR #9998. The previous trigger to resolve was a call to NpyIter_Deallocate
, which was done by del it
or it = None
. Since it
was no longer valid, new access to the operands was prevented.
We started discussing this in the nditer.close pull request in Dec, but I guess it never really got resolved.
I am open to suggestions. The motivation/root-cause of this whole mess is that the operand can have a different, non-compatible dtype from the array. At some point that incompatibility needs to be resolved, and from that point forward the operand is unusable or we need some kind of flag to track between the (now writeable) data in the array and the still-valid(?), writeable operand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, you're right, I was misunderstanding it. The it.operands
are indeed the temporary buffers, and not the original output array, and so should not be accessed after closing. Eg,
>>> a = np.arange(24, dtype='f8')
>>> aiter = np.nditer(a, [], [['readwrite', 'updateifcopy']],
... casting='same_kind',op_dtypes=[np.dtype('f4')])
>>> with aiter as it:
... x = it.operands[0]
>>> x[0] = 10
>>> a[0]
0.0
But in that case all the examples above which do things like return it.operands[1]
don't make sense, since they appear to be returning the temp buffer rather than the original array. We need to provide a way for the user to access the allocated output arrays even after the iterator is closed.
Perhaps it.operators
should point to the original array instead of the temp buffer? Or (more likely) maybe we need to add a new attribute to the nditer object which points to the original arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The nditer docstring says:
operands | (tuple of operand(s)) The array(s) to be iterated over.
which sounds to me like it is describing the "original" arrays, not the buffer copies. All of the code which does return it.operands[1]
seems to implicitly assume it.operands
refers to the original arrays. ... except we just found this isn't true.
The one exception is in the reduction examples, which start by doing it.operands[1][..] = 0
to initialize the reduction, which assumes that it.operands
refers to the buffers. However, there seems to be a whole complicated discussion of a special flag delay_bufalloc
and a special method it.reset()
which, looking at it in retrospect, seem to me to be hacks only necessary because of the lack of distinction between the "original" arrays and the buffer arrays which we have discovered above.
Maybe now that we have a context manager, we can deprecate all of that and add some better behavior? That is, you would do it.operands[1][..] = 0
before the context is entered, (so context entrance replaces it.reset
)?
Maybe that's all going too far, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe the examples should do
- return it.operands[1]
+ if out:
+ return out
+ else:
+ return it.operand[1]
As for delay_bufalloc
it relates to the buffered
flag, which allocates memory independent of the operands
.
@@ -20,7 +20,7 @@ | |||
|
|||
typedef struct NewNpyArrayIterObject_tag NewNpyArrayIterObject; | |||
|
|||
enum NPYITER_CONTEXT {CONTEXT_NOTENTERED, CONTEXT_INSIDE, CONTEXT_EXITED}; | |||
enum NPYITER_CONTEXT {CONTEXT_NOTENTERED, CONTEXT_INSIDE, CLOSED}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the distinction between CONTEXT_NOTENTERED
and CONTEXT_INSIDE
even used anywhere? I feel like npy_bool is_closed
might be all that is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adopted
@@ -1517,7 +1517,8 @@ static PyObject *npyiter_itviews_get(NewNpyArrayIterObject *self) | |||
static PyObject * | |||
npyiter_next(NewNpyArrayIterObject *self) | |||
{ | |||
if (self->iter == NULL || self->iternext == NULL || self->finished) { | |||
if (self->iter == NULL || self->iternext == NULL || | |||
self->finished || (self->managed == CLOSED)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would drop the inner parens here - they're inconsistent with the rest of the expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is strict improvement over master, and the .operands
discussion was something we should have folded into the original issue.
So perhaps we should merge this, and open a new issue about how to handle it.operands
?
Fair enough, let's do that. I'll go ahead with the merge. Thanks Matti! |
PR #9998 added a
close
method tonditer
that resolved writeback semantics but did not actually disallow access to the iterator operands