Skip to content

ENH: Remove NpyIter_Close #11376

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 20, 2018
Merged

ENH: Remove NpyIter_Close #11376

merged 2 commits into from
Jun 20, 2018

Conversation

mattip
Copy link
Member

@mattip mattip commented Jun 18, 2018

Extends PR #9998, replaces PR #11370.

PR #9998 introduced NpyIter_Close as part of the public C-API in order to resolve any WRITEBACK_IF_COPY semantics after iterator use. This PR removes it entirely, instead resolving writeback semantics in NpyIter_Deallocate.

Changes:

  • Extend NpyIter_Deallocate to include writeback resolution if needed via the operand-specific flag
  • Remove NpyIter_Close, adjust the c-api headers, documentation, and internal code.
  • Extend nditer_deallocate which is called as tp_dealloc for the python np.nditer to check if writeback resolution has occurred (via a context manager __exit__ or a direct call to np.close), and warn appropriately. This part requires introspection into the NpyIter. I created a non-public function to do so, is there a better way?
  • Remove a c-extension test for warnings if using NpyIter_Deallocate without NpyIter_Close
  • Repurpose a c-extension test to pure python since the warning moved levels

If this is accepted it should be in 1.15 since it reverts expansion of the C-API for this version

@mattip mattip force-pushed the remove-npyiter_close branch from 07cb10e to 82bc85d Compare June 18, 2018 21:26
@seberg
Copy link
Member

seberg commented Jun 18, 2018

Sounds like a good thing if it has no downside? How come NpyIter_Dealloc is enough after all?

@mattip
Copy link
Member Author

mattip commented Jun 18, 2018

How come NpyIter_Dealloc is enough after all?

When I was starting #9998, NpyIter_Close seemed like the heart of the pull request, it seems I was stuck in a preconception that @eric-wieser saw around.

The only downside I can see is that introspection during npyiter_dealloc requires access to "private" NpyIter flags, I had to create a funcion nditer_has_writeback that is not part of the public NpyIter API but that nditer_pywrap.c has access to.

@@ -2833,12 +2816,11 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,

NPY_UF_DBG_PRINT("Returning Success\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really accurate any more

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

"operands. Typically nditer is used as a context manager "
"otherwise 'close' must be called before reading iteration "
"results.", 1) < 0) {
PyObject * s;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: normally spaced as PyObject *s;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

/* If NPY_OP_ITFLAG_HAS_WRITEBACK flag set on operand, resolve it.
* If the resolution fails (should never happen), continue from the
* next operand and discard the writeback scratch buffers, and return
* failure status
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment doesn't appear to reflect reality

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy-pasted comment removed

* Returns 0 on success, -1 on failure
*/
NPY_NO_EXPORT int
NpyIter_Close(NpyIter *iter)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the code below need to move into NpyIter_Dealloc?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it now lives inside NpyIter_Deallc. Comment expanded to emphasize that.

@charris
Copy link
Member

charris commented Jun 19, 2018

@eric-wieser Look good to you?

@mattip
Copy link
Member Author

mattip commented Jun 19, 2018

@charris please double check my revert of the C-API macro changes.

@ahaldane
Copy link
Member

As I recall the problem with moving the writeback to Npy_Dealloc had something to do with NpyIter_Copy. Is that still a problem? Here's a comment from last time: #9998 (comment). I think what happens with this PR is that the first copy to be deallocated will trigger writeback, and then any subsequent copies to deallocate will do nothing. That's what happened before we introduced the context manager, right? In that case no-one's code should break.

So, I think I agree we can get rid of NpyIter_Close in the C-api here. Nice find!

"Iterator is closed");
return NULL;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for removing all these is_closed checks? It is related to #10956, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a check for self->iter above each of these code blocks. It will now be NULL, so the checks and the flag have been removed

@mattip
Copy link
Member Author

mattip commented Jun 19, 2018

the first copy to be deallocated will trigger writeback, and then any subsequent copies to deallocate will do nothing

@ahaldane yes, it is safe to call PyArray_ResolveWritebackIfCopy multiple times

Copy link
Member

@ahaldane ahaldane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right, that makes sense, and it makes sense that we can merge NpyIter_Close and NpyIter_Deallocate since we don't have to worry about garbage collection delays in the c-api.

LGTM, it's a nice cleanup.

@@ -2831,14 +2814,13 @@ PyUFunc_GeneralizedFunction(PyUFuncObject *ufunc,
Py_XDECREF(full_args.in);
Py_XDECREF(full_args.out);

NPY_UF_DBG_PRINT("Returning Success\n");
NPY_UF_DBG_PRINT("Returning success code 0\n");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like It may not always be 0, see L2801 above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@charris
Copy link
Member

charris commented Jun 20, 2018

I just made a quick scan over it. What macros did you want me to look at?

EDIT: The version stuff looked OK to me. Hmm, but there is a missing line in numpyconfig.h for 1.15 unrelated to this. Needs

#define NPY_1_15_API_VERSION 0x00000008

I'll take care of that elsewhere.

@mattip
Copy link
Member Author

mattip commented Jun 20, 2018

I just made a quick scan over it. What macros did you want me to look at?

@charris the ones in numpyconfig.h

@mattip mattip force-pushed the remove-npyiter_close branch from 52af056 to fe268cc Compare June 20, 2018 05:49
@mattip mattip force-pushed the remove-npyiter_close branch from fe268cc to 7349468 Compare June 20, 2018 05:55
@@ -2830,4 +2789,23 @@ npyiter_checkreducesize(NpyIter *iter, npy_intp count,
}
return count * (*reduce_innersize);
}

NPY_NO_EXPORT npy_bool
npyiter_has_writeback(NpyIter *iter)
Copy link
Member

@eric-wieser eric-wieser Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this function, which traverses the flags a second time - can you just add an extra return value from NpyIter_Dealloc?

Perhaps most simple would be to rename it to

// internal helper
NpyIter_Dealloc_int(NpyIter *iter, npy_bool *did_writeback) {
    // as before, but set did_writeback
}
/* NUMPY_API */
NpyIter_Dealloc(NpyIter *iter) {
    npy_bool did_writeback;
    return NpyIter_Dealloc_int(iter, &did_writeback);
}

Or perhaps even NpyIter_Dealloc_int(iter, warn_if_writeback_needed=False), and have it raise the warning itself

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had considered that, but this API seems cleaner to me. Only the rarely used python nditer needs the warning, where the C NpyIter is used very frequently. The more commonly used path should be as clean as possible. A modified NpyIter_Dealloc would go through a second function call and would contain extra logic even for the fast path, where for nditer the extra call and flag traversal should be a small part of the GC cleanup when tp_dealloc is called.

We could revisit this idea in the future when the dust has settled around nditer use, it is an internal refactoring of the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My fear with the current approach is that it separates the check of whether cleanup is needed from the actual code doing the cleanup.

a second function call

It wouldn't surprise me if the compiler just inlined the outer function - assuming that we're using the internal function definition, rather than the external function pointer.

We could revisit this idea in the future

True, this PR as is is already a nice improvement

nop = NIT_NOP(iter);
op_itflags = NIT_OPITFLAGS(iter);

for (iop=0; iop<nop; iop++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for future reference: should be spaces a round = and <. Not going to make you fixup this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be fixed up online before merge.

for(iop = 0; iop < nop; ++iop, ++dtype, ++object) {
if (op_itflags[iop] & NPY_OP_ITFLAG_HAS_WRITEBACK) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency with how dtype and object are used below, you could use *op_itflags and increment it along with the others.

Probably not worth it though - if anything I find your approach clearer.

"Temporary data has not been written back to one of the "
"operands. Typically nditer is used as a context manager "
"otherwise 'close' must be called before reading iteration "
"results.", 1) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a solution to #10956 yet? Is it guaranteed that this warning will never appear if out is allocated by the iterator?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattip This seems to be the only outstanding issue here that needs to be resolved.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure #10956 is an issue. See my last comment there. It is more a discussion about what is meant by operands. In any case, if out is allocated by the iterator it will never have writeback semantics and so the warning will not appear.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, if out is allocated by the iterator it will never have writeback semantics and so the warning will not appear.

Great, this is what I was worried about

static void
npyiter_dealloc(NewNpyArrayIterObject *self)
{
if (self->iter) {
if (npyiter_has_writeback(self->iter)) {
Copy link
Member

@eric-wieser eric-wieser Jun 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally this would move to tp_finalize, once we drop 2.7. Then you could also just pass npiter.__del__ in as the argument to WriteUnraiseable.

We'd use the PyObject_CallFinalizerFromDealloc function, which is undocumented (https://bugs.python.org/issue33909)

@charris
Copy link
Member

charris commented Jun 20, 2018

I'm going to put this in for the 1.15.0rc1 release. It will be a month or so before the final, so time to polish this a bit more if desired.

@charris charris merged commit e110f93 into numpy:master Jun 20, 2018
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Jun 20, 2018
@charris charris removed this from the 1.15.0 release milestone Jun 20, 2018
@eric-wieser
Copy link
Member

@mattip: NpyIter_Close is still mentioned in the release notes

@mattip mattip deleted the remove-npyiter_close branch June 21, 2018 17:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants