Skip to content

BUG: Array ufunc reduce out tuple #9106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 18, 2017

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented May 12, 2017

This ensures that the normal reduce, accumulate and reduceat methods can deal with getting out as a single-item tuple, which is needed to work with __array_ufunc__ (see #9105). It builds on #9104; can be done together or in sequence.

note: I think I'm right that with my refcounting (not doing anything), but someone who understands this better please check...

Closes #9102.

@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

In case it isn't obvious, this bug also hits things that use .reduce in the background: without this, a.max(out=b) doesn't work either with __array_ufunc__.

if (out_obj != NULL && PyTuple_CheckExact(out_obj) &&
PyTuple_GET_SIZE(out_obj) == 1) {
out_obj = PyTuple_GET_ITEM(out_obj, 0);
}
Copy link
Member

@eric-wieser eric-wieser May 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to give an explicit error here if the output is a tuple of the wrong length - elsewhere, we have:

if (PyTuple_Check(value)) {
    if (PyTuple_GET_SIZE(value) != nout) {
        PyErr_SetString(PyExc_ValueError,
                "The 'out' tuple must have exactly "
                "one entry per ufunc output");
        goto fail;
    }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, done, except a TypeError, as in override.c, because the situation here seems most analogous to not given sufficient number of arguments to a function, in which case cpython also raises a TypeError (see first item of the post-merger to-do list in #8892).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree - this is analogous to having the wrong number of outputs to a function, ie tuple unpacking - so ValueError is correct

@@ -2041,6 +2048,8 @@ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
return NotImplemented

if method == 'at':
if isinstance(inputs[0], A):
inputs[0].info = info
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theres some documentation that should be kept in sync with this, isn't there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR makes the code consistent with the documentation... Unless you think we should explicitly mention in the reduce, accumulate, reduceat documentation that a tuple of 1 is allowed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about the __array_ufunc__ example in the docs, which I thought was kept synced with one of the tests. Maybe it was not that one.

Yes, I think we should document that a tuple is allowed for out in those functions too.

@ngoldbaum
Copy link
Member

Just a comment: what does it mean for out to be anything but a single array? In other words, in what context would you pass in a tuple? Maybe for ufuncs like divmod that return tuples? It would help a lot to discuss this in the __array_ufunc__ docs. It might already be there and I'm missing it though.

@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

@ngoldbaum - this is really just for consistency with the regular __call__ method, to ensure that __array_ufunc__ always gets the same out keyword argument (just like *inputs is a tuple even if there is only one input). And who knows, maybe someone at some point can think of a function with two inputs and two outputs other than divmod where repeated application makes sense....

Anyway, given your comment, I should clearly try to make it more explicit in the __array_ufunc__ documentation.

@eric-wieser
Copy link
Member

@ngoldbaum: Functions like divmod need multiple output arguments. As such, ufunc.__call__ accepts out as a tuple, with an array of each argument.

For simplicity of overrides, out=arr is promoted to out=(arr,) on entry to __array_ufunc__, so that both cases can be handled in the same way.

@eric-wieser
Copy link
Member

@mhvk: We should only allow a tuple to be passed by keyword, not position, for consistency with ufunc.__call__

@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

OK, so the test should be higher up... Will do. And will change to ValueError here and in override.c.

By mistake, any arguments to ufunc.reduce, ufunc.accumulate,
and ufunc.reduceat that were None were removed, rather than just
removing the 'out' argument.  This is corrected here, with
tests added.
@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch from b84b8b5 to 20837e5 Compare May 12, 2017 15:12
@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

OK, made the changes, including docs, and did the rebase. Haven't had much time to check whether things are all OK (though tests pass); have to run now, back in ~2 hours.

@charris
Copy link
Member

charris commented May 12, 2017

Any reason this should not go into 1.13? I'm going to do another release in a few days to take care the current crop of errors.

@charris charris modified the milestones: 1.13.0 release, 1.14.0 release May 12, 2017
must have a shape that the inputs broadcast to.
must have a shape that the inputs broadcast to. A tuple of arrays
(possible only as a keyword argument) must have length equal to the
number of outputs; unwanted outputs can be set to `None`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None doesn't mean "I don't want an output", just "you can allocate it yourself"

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch from 20837e5 to c47cb21 Compare May 12, 2017 17:26
@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

@eric-wieser - I think all is OK now. Thanks for the rapid and careful review!

@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

p.s. Please check my logic for not touching any reference count in the additional check out out...

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't checked the tests, but looks OK for refcounting

freshly-allocated array is returned.
freshly-allocated array is returned. A tuple of arrays
(possible only as a keyword argument) must have length equal to the
number of outputs; use `None` for outputs to be allocated by the ufunc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit wordy - can you combine "use None for outputs to be allocated by the ufunc" with "If not provided a freshly-allocated array is returned. "? Not provided and none are the same thing with the default documented as it is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

        A location into which the result is stored. If not provided or `None`,
        a freshly-allocated array is returned. A tuple (possible only as a
        keyword argument) must have length equal to the number of outputs.

/* if there is a tuple of 1 for `out` in kwds, unpack it */
if (kwds != NULL) {
PyObject *out_obj = PyDict_GetItem(kwds, npy_um_str_out);
if (out_obj != NULL && PyTuple_Check(out_obj)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want CheckExact? Note that PyTuple_GET_SIZE can fail for tuple subclasses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'll change that, here and in the two other locations where out is treated exactly in the same way (once in this file, once in override.c).

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch from c47cb21 to a7748c7 Compare May 12, 2017 18:20
@mhvk
Copy link
Contributor Author

mhvk commented May 12, 2017

All done?

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch 2 times, most recently from 2020e50 to 0a34fff Compare May 12, 2017 19:36
Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM. One last comment - I think we need a .. versionchanged:: 1.13.0 for the tuple version of out for reduce and friends, and possibly a release note

@charris charris changed the title Array ufunc reduce out tuple BUG: Array ufunc reduce out tuple May 13, 2017
@mhvk
Copy link
Contributor Author

mhvk commented May 14, 2017

I'm wondering if we really want to have a versionchanged line for every ufunc. This is there just so we can deal with __array_ufunc__ better; it seems overkill for something I would not recommend doing (and where do we stop -- with this PR, it is possible to do a.max(out=(b,)), but should we adjust all those docstrings too?).

@charris
Copy link
Member

charris commented May 14, 2017

The parameter entries are generated, so if you add it to one, you will get it for all whether you wish it or not ;) You can always say it is not recommended, but I don't think it makes much difference.

@charris
Copy link
Member

charris commented May 16, 2017

I think a .. versionchanged:: would be appropriate for the three affected methods in add_newdocs.py. As the "new" functionality already worked for __call__ in 1.12, I don't think it is needed for that. This is almost a bug fix in that regard. Something in the release notes would also be good if this is not already covered.

@eric-wieser
Copy link
Member

if we really want to have a versionchanged line for every ufunc.

The only documentation that needs to change is for ufunc.reduce etc, not for each function like np.add.reduce - there are no such documentation pages

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch 3 times, most recently from 445ed18 to 15ed128 Compare May 17, 2017 14:42
:exc:`ValueError`. The reduce-like methods all take an *axis* keyword
and a *dtype* keyword, and the arrays must all have dimension >= 1.
:exc:`ValueError`. The reduce-like methods all take an *axis* keyword, a *dtype*
keyword, and an *out* keyword, and the arrays must all have dimension >= 1.
The *axis* keyword specifies the axis of the array over which the reduction
will take place and may be negative, but must be an integer. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is outdated I think, tuples of ints also work. Not a problem for this, but needs fixing at some point. I'll open an issue.

@charris
Copy link
Member

charris commented May 17, 2017

@eric-wieser Look good to you?

out : ndarray, None, or tuple of ndarray and None, optional
A location into which the result is stored. If not provided or `None`,
a freshly-allocated array is returned. A tuple (possible only as a
keyword argument) must have length equal to the number of outputs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be made obvious here that the number of outputs is always 1 for reduce etc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"all ufuncs implemented in numpy" doesn't cover it. This is a property of ufunc.reduce, so even third party ufuncs will conform to this.

I'd probably just go with for consistency with ufunc.call, a tuple of length 1 is also allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid iterating yet more on a change I didn't feel was really necessary to make (as it is just an implementation detail to help __array_ufunc__), could you give the exact text that you suggest?

Copy link
Member

@eric-wieser eric-wieser May 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - iterating here is far easier than iterating and rebasing. Apologies for not thinking of that earlier.

How about, for reduce, at, and reduceat (modulo line wrapping):

out : ndarray, None, or tuple of ndarray and None, optional
    A location into which the result is stored. If not provided or `None`,
    a freshly-allocated array is returned. For consistency with :ref:`ufunc.__call__`,
    this may be wrapped in a 1-element tuple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks. Now I didn't follow this pattern, but did make a change, to add "if given as a keyword" (since we don't check for it if given as a positional element)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, my mistake on forgetting that

@ngoldbaum
Copy link
Member

Is there going to be a numpy 1.13rc2? If not, I'd like to ask for one. I've been waiting on this going in to further experiment with __array_ufunc__.

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch from 15ed128 to 9785011 Compare May 17, 2017 20:54
@mhvk
Copy link
Contributor Author

mhvk commented May 17, 2017

@ngoldbaum - yes, I agree we should have a second release candidate. @charris?

@charris
Copy link
Member

charris commented May 17, 2017

@mhvk Planning on it and would like to get it out this week. I started with rc1 because few look at the betas ...

@mhvk mhvk force-pushed the array_ufunc_reduce_out_tuple branch from 9785011 to ca49f0b Compare May 18, 2017 01:13
int all_none = 1;

if (PyTuple_GET_SIZE(out) != nout) {
PyErr_Format(PyExc_TypeError,
PyErr_Format(PyExc_ValueError,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a release note - we already have a section for changed exceptions

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one has actually not been released yet (it is used in __array_ufunc__), and it now becomes consistent with the regular ufuncs, so I think it is OK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right - testing ufunc.__call__ on 1.12 gives me ValueError, as you have here.

Copy link
Member

@eric-wieser eric-wieser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eric-wieser
Copy link
Member

Will merge tomorrow, along with the backport PR I set up at #9111

@eric-wieser eric-wieser merged commit 1ec9ad6 into numpy:master May 18, 2017
@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label May 18, 2017
@charris charris removed this from the 1.13.0 release milestone May 18, 2017
@mhvk mhvk deleted the array_ufunc_reduce_out_tuple branch August 31, 2017 23:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants