
BUG __numpy_ufunc__ gives unexpected TypeError with subclasses #4815


Closed
wants to merge 1 commit into from

Conversation

@mhvk (Contributor) commented Jun 19, 2014

I'm trying to use __numpy_ufunc__ for astropy Quantity and found surprising difficulties with subclasses, where <Q-subclass>+<Q> yields a TypeError while <Q>+<Q-subclass> works (see below for a simple example using just ndarray subclasses). This behaviour seems surprising. Am I missing something?

import numpy as np
np.version.version
# '1.9.0.dev-133d4f4'
class MyA(np.ndarray):                                
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        return getattr(ufunc, method)(*(input.view(np.ndarray) for input in inputs), **kwargs)

class MyA2(MyA):                                      
    pass

np.arange(2.).view(MyA) + np.arange(2.).view(MyA2)
# array([ 0.,  2.])
np.arange(2.).view(MyA2) + np.arange(2.).view(MyA)
# TypeError: unsupported operand type(s) for +: 'MyA2' and 'MyA'

@juliantaylor juliantaylor added this to the 1.9 blockers milestone Jun 19, 2014
@pv (Member) commented Jun 19, 2014

This is due to the behavior exception in ndarray.__add__ (number.c:array_add): it tries to yield execution to __radd__ rather than calling the ufunc (so that __numpy_ufunc__ is never invoked). This occurs when (i) the RHS operand is not a strict subclass of the LHS, and (ii) the RHS implements both __radd__ and __numpy_ufunc__.

The problem comes from the fact that ndarray.__radd__ is implemented in C, and the slot works so that it just swaps the arguments, so the call chain in b + a is:

  • MyA2.__add__(b, a): implemented in the superclass, ndarray.__add__
  • ndarray.__add__(b, a): the RHS is not a subclass of the LHS and implements __radd__, so it returns NotImplemented in order to defer to it
  • CPython sees the NotImplemented and calls __radd__ instead
  • MyA.__radd__(a, b): implemented in the superclass, ndarray.__radd__
  • ndarray.__radd__(a, b): CPython implements this by calling ndarray.__add__(b, a)
  • ndarray.__add__(b, a): NotImplemented again, as above
  • Since both calls deferred, CPython correctly decides that addition is not implemented

It's not possible to write different code for ndarray.__radd__ in C (at least via the slot). I'm not sure if the whole idea of deferring to the RHS is broken by this constraint.

A workaround is to implement the RHS binop methods yourself in the parent class via def __radd__(self, other): return self.view(parentcls).__radd__(other).

This is again circling the issue that it is unclear how one is supposed to implement the binops within the Python RHS/LHS mechanism, so that things work out as expected when interacting with subclasses and non-subclasses. I'm becoming less convinced that the problem is solvable...
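The double-deferral call chain can be reproduced in pure Python without NumPy at all. A minimal sketch, with hypothetical Base/Sub classes standing in for ndarray and its subclass: the base class defers to any non-subclass RHS that has __radd__, and its __radd__ shares the same implementation with swapped arguments, as the C slot effectively does.

```python
class Base:
    def __add__(self, other):
        # mimic ndarray.__add__: defer when the RHS is not a strict
        # subclass of the LHS and has __radd__
        if (type(other) is not type(self) and
                not issubclass(type(other), type(self)) and
                hasattr(other, '__radd__')):
            return NotImplemented
        return 'added'

    def __radd__(self, other):
        # the C slot just swaps the arguments: __radd__(a, b) -> __add__(b, a)
        return Base.__add__(other, self)

class Sub(Base):
    pass

print(Base() + Sub())   # Base.__add__ handles the subclass RHS: 'added'
try:
    Sub() + Base()      # Sub.__add__ defers; Base.__radd__ defers again
except TypeError as exc:
    print('TypeError:', exc)
```

Both deferrals return NotImplemented, so CPython concludes the operation is unsupported, exactly as in the MyA2 + MyA example above.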

@mhvk (Contributor, author) commented Jun 19, 2014

@pv - that seems troublesome! I guess part of the trouble is that the code thinks MyA and MyA2 define __radd__ even though they do not actually define it themselves; they just inherit it from the superclass. In a straight CPython example where I define the reverse function in a superclass, it is properly ignored:

class MyF(float):
    def __sub__(self, other):
        print("mysub!")
    def __rsub__(self, other):
        print("myrsub!")

class MyF2(MyF):
    pass

MyF(2.) - MyF2(1.)
# mysub!
MyF2(2.) - MyF(1.)
# mysub!

Could the problem be that what number.c does is PyObject_HasAttrString(other, right_name), which would seem to be the equivalent of hasattr(type(other), '__rsub__'), when it should perhaps be the equivalent of something like '__rsub__' in type(other).__dict__? At least for the CPython example, that would work:

In [23]: hasattr(MyF, '__rsub__')
Out[23]: True

In [24]: hasattr(MyF2, '__rsub__')
Out[24]: True

In [25]: '__rsub__' in MyF.__dict__
Out[25]: True

In [26]: '__rsub__' in MyF2.__dict__
Out[26]: False

EDIT: the same holds for the objects rather than the class: hasattr(MyF2(), '__rsub__') is True while '__rsub__' in MyF2().__class__.__dict__ is False

@pv (Member) commented Jun 19, 2014

The problem here is that ndarray.__add__ does not know whether it is __add__ or __radd__, as both go through the same tp_add slot in CPython.

Checking for __radd__ in __dict__ does not solve the issue as __radd__ could also be defined in a non-ndarray parent class.

However, checking for type(other).__radd__ is not np.ndarray.__radd__ would in principle be one workaround, i.e. making it equivalent to

class ndarray(object):
    def __add__(self, other):
        if (isinstance(self, ndarray) and
                type(other) is not ndarray and
                not issubclass(type(other), type(self)) and
                hasattr(other, '__radd__') and
                type(other).__radd__ is not ndarray.__radd__):
            # defer to __radd__
            return NotImplemented
        return np.add(self, other)  # call the ufunc

    def __radd__(self, other):
        # CPython slots work like this
        return ndarray.__add__(other, self)

Note that apart from the __radd__ = __add__ issue which is peculiar to the C implementation, this is essentially a pure-Python problem in designing an appropriately co-operative parent class binop routine. (The problem is not related to __numpy_ufunc__.) I don't see a way to avoid ugly hacks such as above while simultaneously preserving "expected" properties.

NB. there are a number of other related open issues, which can give perspective on what people regard as "expected" behavior. An implementation such as class ndarray: def __add__(self, other): return np.add(self, other) causes "surprising" binop behavior.
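The kind of surprise pv mentions can be shown with a pure-Python analogue (hypothetical classes, not NumPy's actual code): a base class whose __add__ always handles the operation and never returns NotImplemented, so an unrelated RHS never gets its reflected method called.

```python
class Naive:
    def __add__(self, other):
        # always "succeeds", like np.add coercing its second argument
        return ('naive-add', type(other).__name__)

class Wrapper:          # not a Naive subclass
    def __radd__(self, other):
        return 'wrapper-radd'

# Wrapper.__radd__ is never consulted: Naive.__add__ returns a value,
# so CPython has no reason to try the reflected method.
print(Naive() + Wrapper())
```

This is "surprising" for classes like sparse matrices that want `ndarray + spmatrix` to be handled by their own __radd__.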

@mhvk (Contributor, author) commented Jun 19, 2014

Hmm, I fear my example may not be that great -- one might have to traverse the whole MRO to see whether the reverse operation is defined in a subclass. One possible way out would be to check whether the reverse operation of the RHS refers to the same function as that on the LHS, i.e., only hand over to __radd__ when MyA2.__radd__ is not MyA.__radd__.

... OK, I see you reached the same conclusion. I guess the question is whether one could do the equivalent of type(other).__radd__ is not type(self).__radd__ inside number.c. It would seem to match what CPython does (I just checked: while one can override the operator on an instance, it will not actually get used).
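The identity check being discussed can be expressed directly in Python. A sketch (the helper name is hypothetical) using the MyF classes from above, plus a subclass that genuinely overrides the reverse op:

```python
class MyF(float):
    def __sub__(self, other):
        return 'mysub'
    def __rsub__(self, other):
        return 'myrsub'

class MyF2(MyF):            # inherits __rsub__ unchanged
    pass

class MyF3(MyF):            # genuinely overrides __rsub__
    def __rsub__(self, other):
        return 'myrsub3'

def rhs_overrides_rsub(lhs, rhs):
    # defer to the RHS only if its reverse method is a different
    # function from the LHS's, i.e. overridden somewhere in its MRO
    return type(rhs).__rsub__ is not type(lhs).__rsub__

print(rhs_overrides_rsub(MyF(2), MyF2(1)))   # False: do not defer
print(rhs_overrides_rsub(MyF(2), MyF3(1)))   # True: defer to __rsub__
```

Unlike the `'__rsub__' in __dict__` check, this handles reverse methods inherited from anywhere in the MRO, since it compares the resolved attributes rather than a single class dict.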

@njsmith (Member) commented Jun 19, 2014

Do you remember the details of why we added the special-case dispatch to array_add? Why doesn't array_add just dispatch to np.add, which will then follow the __numpy_ufunc__ dispatch rules (which give both objects a chance to handle the operation, as desired)?

@mhvk (Contributor, author) commented Jun 19, 2014

Actually, if I understand correctly, the whole thing is needed in the first place to ensure that if the RHS has __numpy_ufunc__, it gets called even if CPython has not already given it priority. Is this correct? But what if the LHS also defines __numpy_ufunc__ (as in my example)? Should it not then have precedence?
Certainly, the specific issue above goes away if I change:

@@ -135,7 +135,7 @@ needs_right_binop_forward(PyObject *self, PyObject *other,
          */
         return 0;
     }
-    if (has_ufunc_attr(other) &&
+    if (!has_ufunc_attr(self) && has_ufunc_attr(other) &&
         PyObject_HasAttrString(other, right_name)) {
         return 1;
     }

@pv (Member) commented Jun 19, 2014

@njsmith: yes, if we feel that __numpy_ufunc__ should have priority over the Python binop mechanism, that is a possible --- and simpler --- design route. I think I'm backing out from my opposition to this, in light of the hacks that avoiding it seems to require.

I don't immediately recall what will break if this mechanism is simply ripped out and my PR gh-3856 is reverted. Potential points of failure: (i) operations with classes that don't define __numpy_ufunc__, (ii) interference with the parts of __array_priority__ that are in the binop mechanism, (iii) surprising people who expect their __r*__ methods to be called.

(The related other issues: gh-3812, gh-3856, gh-3375, gh-4766, gh-3502, gh-4709)

@njsmith (Member) commented Jun 19, 2014

The Python binop mechanism still has priority over the numpy one in the sense that the usual rules still have to apply before ndarray.__add__ (or __radd__) gets called -- so for purposes of justifying this conceptually we have the option of just saying that this is how ndarray implements __add__, and it's the other class's job to figure out how to interoperate with that if they want to :-).

I agree that the more important issue is compatibility. The above reasoning actually seems totally reasonable to me in a world where numpy had always implemented __numpy_ufunc__ -- anyone who wants to implement ndarray + mytype has always had to have special knowledge of how ndarrays work, so asking them to implement __numpy_ufunc__ is reasonable.

Of course we aren't in that world. But I was surprised at your comment that this special case is only triggered if __numpy_ufunc__ is defined -- I would have expected backwards-compat cases to be triggered when __numpy_ufunc__ isn't defined :-).

As a straw man, which cases would break if we say (1) if the other operand defines __numpy_ufunc__, then we skip all the other checks and just dispatch to np.add and let the __numpy_ufunc__ dispatch take care of things, (2) otherwise we do whatever we used to do historically (checking for __array_priority__ etc.)? (And as a lower-priority item, possibly we eventually want to deprecate some of those things in favor of __numpy_ufunc__.)


@mhvk (Contributor, author) commented Jun 19, 2014

@njsmith - to add to the complications, one should also consider the case where self has __numpy_ufunc__ (which is the case for the example at the top).

@mhvk (Contributor, author) commented Jun 19, 2014

@pv, @njsmith - as I mentioned above, at least my problem gets solved if the handover to other only happens when self does not have __numpy_ufunc__ itself. I do think this is more logical (if one wants to keep the functionality at all). With appropriate changes to the tests, this passes (though those test changes should be made neater and better explained if it is decided to go forward with this).

@mhvk (Contributor, author) commented Jun 19, 2014

For completeness: Even with my changes, I'm still bitten by #4766

EDIT: this does not bite for my example above, but for astropy/astropy#2583

@mhvk (Contributor, author) commented Jun 20, 2014

FYI, the change to defer to other.__numpy_ufunc__ only if self doesn't have it now passes Travis (the one failure is a false alarm).

Not quite sure what to think about the larger question, though like @njsmith I wonder whether it makes sense to let the priority over which object gets the call depend on whether __numpy_ufunc__ is present -- it would seem more logical if its only function were to help the ufunc call decide how to be executed.

@pv (Member) commented Jul 24, 2014

One aspect that should be considered is that

ndarray * usercls

needs to call usercls.__rmul__(ndarray) and not usercls.__numpy_ufunc__(...). For example, for Scipy, usercls is a sparse matrix for which * means matrix multiply, so it should not dispatch to np.multiply. This rules out approach (1) that @njsmith suggests above.

The suggestion by @mhvk of using if hasattr(other, '__numpy_ufunc__') and not hasattr(self, '__numpy_ufunc__') and hasattr(other, '__rmul__'): return NotImplemented is OK for ndarrays, but can cause problems for subclasses implementing __numpy_ufunc__.

The current __array_priority__ approach (which is what makes things work as expected for Scipy sparse matrices currently) also has limitations --- if you try multiplying MaskedArray * spmatrix, it produces garbage, as ma has higher array priority. Moreover, it's also broken in other cases (gh-4766).

***

A core question seems to be: if * has different meanings for classes A and B, and both can perform the * operation using data from the other, what do

A * B
B * A

mean? Does the LHS get to decide? Or is there a commutative decision rule (as with __array_priority__)? Or should the decision be semi-commutative (as in @mhvk's suggestion), in the sense that if B.__numpy_ufunc__ would get called, we prefer calling B.__rmul__ instead if present?
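The semi-commutative option can be sketched as follows, with hypothetical Arr/Sparse classes standing in for ndarray and a scipy sparse matrix (this is an illustration of the rule, not NumPy's implementation):

```python
class Arr:
    def __init__(self, v):
        self.v = v
    def __mul__(self, other):
        # semi-commutative rule: if the RHS would be offered the
        # operation via __numpy_ufunc__ anyway, prefer its __rmul__
        if (hasattr(type(other), '__numpy_ufunc__') and
                hasattr(type(other), '__rmul__')):
            return NotImplemented
        return ('elementwise', self.v, other)

class Sparse:
    def __numpy_ufunc__(self, ufunc, method, i, inputs, **kwargs):
        pass
    def __rmul__(self, other):
        # for a sparse matrix, * means matrix multiply
        return 'matrix-product'

print(Arr(2) * 3)          # ('elementwise', 2, 3): plain RHS, LHS handles it
print(Arr(2) * Sparse())   # 'matrix-product' via Sparse.__rmul__
```

Under this rule the meaning of * is still decided at the binop layer: the LHS steps aside for a __numpy_ufunc__-aware RHS rather than committing to np.multiply on its behalf.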

@mhvk (Contributor, author) commented Jul 24, 2014

@pv - in your example, I would think that the sequence would be as follows: ndarray wants to multiply, but discovers that it does not know how to multiply with the sparse matrix class and thus returns NotImplemented. Then, the sparse matrix class gets a shot, which does know how to do it.

Here, in principle, __numpy_ufunc__ should not have to be involved at all, but I guess in practice it is anyway, since there is no other clean way to tell ndarray that "I look like a subclass but do not exactly behave like it". So, in that sense the approach as currently implemented works fine: ndarray notices that usercls has __numpy_ufunc__ and returns NotImplemented.

But obviously it causes the problems that led to this issue... I'm not quite sure why you think there is a problem with my approach in general: essentially, if the first class also has __numpy_ufunc__ it tells the C code here that it will deal with other classes itself. Obviously, this does imply that its __numpy_ufunc__ has to carefully check that it can deal with the other class, again returning NotImplemented if it doesn't know what to do.

I do worry about your specific example of multiplication having a different meaning. E.g., suppose I multiply an astropy Quantity with a sparse matrix. In my scheme, Quantity.__numpy_ufunc__ gets called, which will turn the quantity into an ndarray and then do ufunc(ndarray, other). Obviously, this gives other its chance to use its override of the ufunc, but as you wrote, by this time the ufunc itself has been decided: it is multiply, and SparseMatrix.__numpy_ufunc__ somehow has to realise this and return NotImplemented, so that sparse_matrix.__rmul__(quantity) will still get called. Would this be possible in the SparseMatrix context? (Alternatively, SparseMatrix.__numpy_ufunc__ could deal with it by doing a proper matrix multiplication instead of multiply, but this seems harder and more error-prone.)

@pv (Member) commented Jul 24, 2014

@mhvk: yes, this issue was considered earlier when adding the hasattr(other, '__numpy_ufunc__') rule, so it works currently.

The problem with the approach you propose is that it will dispatch to np.multiply from __mul__ if both classes implement __numpy_ufunc__, and after that the ufunc simply has to do its job, which is elementwise multiplication.

I think the responsibility of dealing with this problem lies with the binary op layer, and not the ufunc layer, because the decision is about what operation * maps to. After the execution has been handed over to np.multiply I think the decision has been made.

The problem is also not fully related to __numpy_ufunc__. Another example is np.matrix, which currently works around the same issue via __array_priority__, which then has a number of other shortcomings (MaskedArray * matrix and matrix * MaskedArray give different results).

It is in principle possible to add informative markers such as np.multiply(a, b, _binop=True) to ufuncs that get passed on to the overrides. But I have a feeling that this just pushes the problem elsewhere --- the decision on what e.g. * means still has to be made at some point, and I suspect pushing it to __numpy_ufunc__ will encounter similar problems as here (because the __numpy_ufunc__ decision-making mechanism is the same as in Python binop overrides).

@mhvk (Contributor, author) commented Jul 24, 2014

@pv - OK, that makes sense. I agree that this really has to be done at the operator level: SparseMatrix has to be able to tell that it cannot be other in a multiply. But then perhaps this is also a problem we should not try to solve in this PR, whose aim is to make the __numpy_ufunc__ functionality work right... If you agree, were there other problems you had in mind when you mentioned that you thought my approach "is OK for ndarrays, but can cause problems for subclasses implementing __numpy_ufunc__"?

@pv (Member) commented Jul 24, 2014

The change in this PR as it is now makes Quantity * spmatrix be elementwise and spmatrix * Quantity a matrix product, so it will introduce new problems.

I think, unless we want to simply keep using the __array_priority__ mechanism that has been there up to now, a more specific solution to this ndarray-subclass problem would be to check whether other.__rmul__ is ndarray.__rmul__. If it is, then the desired behavior is definitely to fall back onto np.multiply.

@pv (Member) commented Jul 24, 2014

Here's one suggestion: https://github.com/pv/numpy/compare/numpyufunc-binop-subclass-fix

At first sight this may look like it's munging CPython implementation details, but the added checks can also be expressed as other.__class__.__rop__ is self.__class__.__rop__.

Do we tolerate this? Does this open other cans of worms?
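In pure Python, a check of that shape resolves the subclass deadlock from the original report: defer only when the RHS's reverse op differs from the LHS's own. A sketch with hypothetical Base/Sub classes:

```python
class Base:
    def __add__(self, other):
        radd = getattr(type(other), '__radd__', None)
        # the check: other.__class__.__radd__ is self.__class__.__radd__
        # means the RHS brings nothing new, so do not defer to it
        if radd is not None and radd is not type(self).__radd__:
            return NotImplemented
        return 'added'

    def __radd__(self, other):
        # the C slot swaps the arguments
        return Base.__add__(other, self)

class Sub(Base):
    pass

# Both orders now work; neither spuriously defers to an inherited __radd__.
print(Base() + Sub())   # 'added'
print(Sub() + Base())   # 'added'
```

An RHS that actually overrides __radd__ (a third-party class, say) still gets its deferral, because its reverse method is a different function object.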

juliantaylor added a commit to juliantaylor/numpy that referenced this pull request Jul 29, 2014
The ufunc override `__numpy_ufunc__` still has a few unresolved issues regarding its behavior towards Python operators; see numpy gh-4815. To avoid releasing numpy with an interface that might change in the next numpy version, and to not further delay the 1.9 release, it has been decided to disable the feature in 1.9.0.
@charris charris modified the milestones: 1.9 blockers, 1.10 blockers Aug 22, 2014
@charris (Member) commented Aug 22, 2014

Changed this to a 1.10 blocker, as __numpy_ufunc__ has been removed from 1.9.

@mhvk (Contributor, author) commented Sep 18, 2014

@pv - as I've restarted working on Quantity.__numpy_ufunc__, I have now finally checked that your suggested method solves the problems I had. It seems to me an elegant solution, so I would suggest it be considered instead of mine.

@mhvk (Contributor, author) commented Oct 10, 2014

@pv - it would be good if we could solve this issue of deciding which operand gets to use its __numpy_ufunc__ for binops. As I wrote 3 weeks ago, I think your solution is much more elegant. To move forward with that, will you make it into a PR? In principle, I could also change the PR here to use it, but as I'm much less familiar with the code, that does not seem optimal.

@mhvk (Contributor, author) commented Oct 27, 2014

@pv, @njsmith - what would you recommend as the path forward on this issue? For astropy's Quantity, the introduction of __numpy_ufunc__ is fantastic (a factor-two speedup, as we don't have to discard one ufunc calculation), but it would really help if there were no order-of-arguments dependence in the treatment of subclasses. In principle, I think either my approach or @pv's is OK, with the latter perhaps better.

@mhvk (Contributor, author) commented Jan 19, 2015

Checking the 1.10 blockers brought up by my __numpy_ufunc__ trials: the issue here remains.

@mhvk (Contributor, author) commented Mar 10, 2015

@pv, @njsmith - still wondering how to move forward with this __numpy_ufunc__ blocker, in particular what route to take to resolve it. As is, we have two options:

  1. Only let RHS.__numpy_ufunc__ take over (breaking the normal rules) if self does not also have __numpy_ufunc__ (this PR).
  2. Only let RHS.__numpy_ufunc__ take over if it defines a reverse method that is different from that of self (code from @pv).

In the end, both try to express the cases where RHS should be able to override normal handling but which are not captured by the normal ufunc mechanism. If people agree @pv's approach is the better one, I'd be happy to turn it into a PR.

@mhvk (Contributor, author) commented Apr 5, 2015

@pv - assuming that the main reason there was no progress on this for a while was that you did not have the time, I went ahead, rebased your solution, added a test case, and made it into a new PR (#5748).

@charris charris modified the milestones: 1.10.0 release, 1.10 blockers Apr 22, 2015
@charris (Member) commented Apr 24, 2015

@mhvk do you intend for #5748 to replace this?

@mhvk (Contributor, author) commented Apr 25, 2015

@charris - for my purposes, either this PR or #5748 would do (where the latter is really the work by @pv). There is a bit more discussion about what's the right approach in #5748; if we do return to this PR, I would obviously need to add test cases, remove print statements, etc.

@charris (Member) commented May 6, 2015

@mhvk @pv As I see it, Pauli addresses the Python interpreter operator problems at the operator level, which I think is where it belongs. I'm inclined to close this and go with #5748. Is it correct that ufuncs still have problems? I note, in trying astropy.units, that np.multiply(42, u.meter) returns NotImplemented.

@mhvk (Contributor, author) commented May 6, 2015

@charris - yes, fine to go with #5748; I'll close this PR.

It is also fine that np.multiply(42, u.meter) returns NotImplemented -- we just want to support 42 * u.m (or <some-array> * u.m) and that is done well enough via the unit's __rmul__; I don't think the numpy functions have to be able to handle that independently.

@mhvk mhvk closed this May 6, 2015
charris added a commit that referenced this pull request May 6, 2015
BUG Numpy ufunc binop fix for subclasses with __numpy_ufunc__ (closes #4815)
@mhvk mhvk deleted the operator_numpy_ufunc_priority branch February 19, 2017 22:11