Skip to content

No way to mark "NotImplemented" for comparison operations on subclasses #4709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
mdboom opened this issue May 14, 2014 · 12 comments · Fixed by #4711
Closed

No way to mark "NotImplemented" for comparison operations on subclasses #4709

mdboom opened this issue May 14, 2014 · 12 comments · Fixed by #4711

Comments

@mdboom
Copy link
Contributor

mdboom commented May 14, 2014

In astropy, we have a Quantity class which is a subclass of ndarray for physical quantities. One of its features is that if you compare two of them and the units aren't equivalent, they compare to false.

In Numpy master (but not Numpy 1.8.x), as of 9b8f6c7 (@seberg), it seems there might no longer be a way to implement this, since raising an exception in __array_prepare__ bubbles up through ndarray.richcompare (it currently emits a DeprecationWarning, but I understand will eventually pass the original exception through). Since __array_prepare__ is only able to return an instance ndarray (or subclass), there doesn't seem to be a way to flag to the comparison operator that what we really want is NotImplemented. Overriding the __eq__ operators, etc., doesn't seem to be sufficient, since it can't handle the case where there is a (regular) array on the left hand side.

Here's a minimal script to reproduce the issue (the actual code is much more complex, but this boils it down to the essence of the problem):

import warnings
warnings.filterwarnings("error", ".*", DeprecationWarning)

import numpy as np


class Quantity(np.ndarray):
    def __repr__(self):
        return '<{0} {1}>'.format(
            np.ndarray.__repr__(self), self._unit)

    def __new__(cls, value, unit, dtype=None):
        value = np.array(value, dtype=dtype)
        value = value.view(cls)
        value._unit = unit
        return value

    def __array_prepare__(self, obj, context=None):
        function = context[0]
        args = context[1][:function.nin]

        if ((not isinstance(args[0], Quantity) or args[0]._unit != self._unit) or
            (not isinstance(args[1], Quantity) or args[1]._unit != self._unit)):
            raise ValueError("Units don't match")

        if not isinstance(obj, np.ndarray):
            obj = np.array(obj)
        view = obj.view(Quantity)
        view._unit = self._unit
        return view

    # def __eq__(self, other):
    #     try:
    #         super(Quantity, self).__eq__(other)
    #     except (DeprecationWarning, ValueError):
    #         return NotImplemented


q = Quantity([10.0, 20.0], 'm')
print 10 == q
print np.int64(10) == q

The output on Numpy 1.8.x is:

False
False

The output on Numpy master is:

Traceback (most recent call last):
  File "quantity_compare.py", line 40, in <module>
    print 10 == q
DeprecationWarning: elementwise comparison failed; this will raise the error in the future.

If you uncomment the __eq__ function above, the first test passes, but the second still fails.

Is there a workaround here, or something I'm missing?

@mhvk, @astrofrog

@njsmith
Copy link
Member

njsmith commented May 14, 2014

Query: so I get the feeling that you guys are probably pretty convinced at this point that dealing with numpy's subclassing machinery is a huge hassle. And on the numpy side, trying to keep duct-taping these things together so astropy keeps working is becoming quite a challenge (see also the whole mess with median calling mean). And in the long run, I don't see this incrementally converging on any sustainable system -- instead, IMO, sooner or later you'll end up throwing out the subclass implementation and switching to one based on custom dtypes.

The blocker for this is that numpy doesn't yet have sufficiently robust support for parametrized dtypes to make this possible. And it could (again IMO), but no-one currently has time to make it happen. Any chance any of you guys would be interested in helping with that? It would require more effort upfront, but I'd be happy to provide design and review effort on the numpy side, the changes would have a bunch of other beneficial side-effects (e.g. we'd also much better support for missing data and categorical data), and it would solve all these quantity problems once and for all.

@pv
Copy link
Member

pv commented May 14, 2014

Does just overriding __eq__ et al. work here? (With isinstance checks as appropriate, so that you never enter ndarray.__eq__ if units are not compatible.)

Per Python subclassing rules, shouldn't in x == y it call y.__eq__(x) first?

@pv
Copy link
Member

pv commented May 14, 2014

Ok, I see --- the subclass rule is not enough because Numpy scalars are not array subclasses, but aggressively compare all ndarrays. That can probably be worked around by overriding np.equal via __numpy_ufunc__.

A better fix in Numpy might be to make Numpy scalar comparisons return NotImplemented with ndarrays, so that the right-hand operation gets a chance to run.

@seberg
Copy link
Member

seberg commented May 14, 2014

The subclass rule should be enough here, at least unless other subclasses with higher priority enter the picture. I think we fixed __eq__ priority for arrays in 1.9 too (maybe 1.8) so maybe we should check if we can't fix it for scalars as well, they should behave like arrays for operators but apparently don't for ==.

The idea of the change was that showing the error would make more sense then just returning False, so yeah, there is likely no easy way to get around this, though I could imagine some __array_wrap__ context inspection can do it (and __numpy_ufunc__ of course too).

@pv
Copy link
Member

pv commented May 14, 2014

It fails for scalars, because scalars call tp_richcompare directly without doing the subclass check, and the ndarray tp_richcompare doesn't second-guess (which is the correct behavior).

@pv
Copy link
Member

pv commented May 14, 2014

Fix in gh-4711

@pv
Copy link
Member

pv commented May 14, 2014

Masked arrays and matrices seem to still be a problem :/

The __numpy_ufunc__ code path for this case deals OK with the issue, but the old __array_prepare__ etc. stuff doesn't.

@pv pv reopened this May 14, 2014
@mdboom
Copy link
Contributor Author

mdboom commented May 15, 2014

Thanks everyone for looking into this. If there's a low-hanging solution, that would be great. I agree that in the long run, the dtype solution is superior, but ideally (well, selfishly for us), it would be nice to have such a solution in place before these changes are released (and I want a pony, too) 😉.

@abalkin
Copy link
Contributor

abalkin commented May 15, 2014

I believe python stdlib has a similar issue in the datetime module. If so, I doubt there is a "low-hanging solution" since that bug has not been resolved in 5 years.

@pv
Copy link
Member

pv commented May 15, 2014

One solution is to just switch from __array_prepare__ to __numpy_ufunc__ (both can be present in the same class, but the former is not used if the latter is present). This doesn't however pass the execution to __eq__ in the case ndarray-subclass == Quantity if the subclass __eq__ is written to aggressively to cast, but it goes directly via the np.equal ufunc.

I'm not exactly sure how this situation with two subclasses that don't know about each other is supposed to be handled in Python's binop override mechanism. Clearly some sort of co-operativity is required, but I'm not sure what is the best practice.

@pv
Copy link
Member

pv commented May 15, 2014

One option to work around this would be to change the ndarray base class binop implementation to the C equivalent of

def __somebinop__(self, other):
    if type(self) is not np.ndarray:
        return operator.binop(np.asarray(self), other).view(type(self))
    else:
        # as previously

and be careful to do the same in MaskedArray binop implementations. Not sure if this has unintended consequences, however.

@mattip
Copy link
Member

mattip commented Dec 7, 2022

I think we can close this. Please reopen if there are still open questions around __array_prepare__ and comparisons,

@mattip mattip closed this as completed Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants