BUG: Masked scalar comparison returns float #4332

abalkin · 2014-02-20T15:10:36Z

>>> (numpy.ma.array(5.5, mask=True) > 0).dtype
dtype('float64')

Moreover, the result has float dtype even if the masked scalar is an int:

>>> (numpy.ma.array(5, dtype='i', mask=True) < 0).dtype
dtype('float64')

The problem is not present in oldnumeric:

>>> (numpy.oldnumeric.ma.array(5.5, mask=True) < 0).dtype
dtype('bool')

>>> numpy.__version__
'1.8.0'

abalkin · 2014-02-20T16:02:11Z

Apparently the problem is with this logic

            if result.shape == () and m:
                return masked

in MaskedArray.__array_wrap__. Any scalar result from ufunc gets replaced with the masked singleton which has

>>> numpy.ma.masked.dtype
dtype('float64')

So the issue is not limited to comparison operators:

>>> -numpy.ma.array(5, mask=True) is numpy.ma.masked
True
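
The effect of that check can be modelled without numpy. The following is only a toy sketch: Result, MASKED and wrap_scalar are hypothetical names standing in for a 0-d result, the ma.masked singleton and the quoted branch of __array_wrap__; it is not the library's code.

```python
from collections import namedtuple

# Hypothetical stand-in for a 0-d ufunc result: value, dtype name, mask flag.
Result = namedtuple("Result", ["value", "dtype", "mask"])

# Hypothetical stand-in for numpy.ma.masked: one shared object, fixed float64 dtype.
MASKED = Result(0.0, "float64", True)


def wrap_scalar(value, dtype, mask):
    """Mimic the quoted check: any masked scalar result becomes the singleton."""
    if mask:
        return MASKED  # the dtype computed by the ufunc is discarded here
    return Result(value, dtype, mask)


# A comparison produces a bool result, but masking swaps in the singleton.
unmasked = wrap_scalar(True, "bool", mask=False)
masked_result = wrap_scalar(True, "bool", mask=True)

print(unmasked.dtype)           # bool
print(masked_result.dtype)      # float64
print(masked_result is MASKED)  # True
```

In this model the dtype of a masked scalar result never depends on the operation, which matches the float64 results reported above.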

pierregm · 2014-02-22T16:22:36Z

The whole idea is indeed to return a specific value, the masked singleton, however it is defined. That way, one can test whether an item of an ndarray is masked by simply checking item is masked.

The reason why it worked with oldnumeric is that the masked singleton was defined as np.ma.MaskedArray(False, dtype=bool). I'd be more willing to have this changed back than to have __array_wrap__ stop returning the np.ma.masked value. I'm afraid the current PR is such a radical change that it will break things down the road.
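
For readers unfamiliar with the idiom, a minimal sketch of the identity test follows. It assumes a numpy installation; the array contents are made up for illustration.

```python
import numpy as np

# A small masked array whose middle element is masked (illustrative data).
a = np.ma.array([1, 2, 3], mask=[False, True, False])

# Indexing a masked element returns the shared singleton, so "is" works.
print(a[1] is np.ma.masked)  # True
print(a[0] is np.ma.masked)  # False

# The singleton has its own dtype, independent of the array it came from.
print(np.ma.masked.dtype)    # float64
```

The last line is the source of the reported behaviour: the singleton reports float64 even though it was obtained from an integer array.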

abalkin · 2014-02-22T22:00:22Z

> The reason why it worked with oldnumeric is that the masked singleton was defined as np.ma.MaskedArray(False, dtype=bool).

No,

 >>> numpy.oldnumeric.ma.masked.dtype
 dtype('int32')

but

 >>> (numpy.oldnumeric.ma.masked > numpy.oldnumeric.ma.masked).dtype
 dtype('bool')

abalkin · 2014-02-22T22:06:45Z

Would it be less radical if we preserve the existing behavior when result dtype is float, but return the correct type otherwise?

pierregm · 2014-02-23T14:53:41Z

@abalkin Can you give an example of what you have in mind?

abalkin · 2014-02-23T16:52:42Z

@pierregm

--- a/numpy/ma/core.py
+++ b/numpy/ma/core.py
@@ -2841,7 +2841,7 @@ class MaskedArray(ndarray):
                     # Don't modify inplace, we risk back-propagation
                     m = (m | d)
             # Make sure the mask has the proper size
-            if result.shape == () and m:
+            if result.shape == () and m and m.dtype == masked.dtype:
                 return masked
             else:
                 result._mask = m

I am not sure backward compatibility is such a big concern here. I understand that some people like the a[i] is ma.masked idiom, but I have yet to see anyone write a.sum() is not ma.masked.

In the long run, I would really like to see use of the masked singleton deprecated. As the numpy type system gets richer, it will cause more and more problems.
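
The compromise can be modelled in plain Python. This is a toy sketch with hypothetical names (Result, MASKED, wrap_scalar_compromise), and it assumes the intent stated earlier in the thread, namely comparing the result's dtype with the singleton's dtype, whereas the diff as posted compares m.dtype.

```python
from collections import namedtuple

# Hypothetical stand-in for a 0-d ufunc result: value, dtype name, mask flag.
Result = namedtuple("Result", ["value", "dtype", "mask"])

# Hypothetical stand-in for numpy.ma.masked with its float64 dtype.
MASKED = Result(0.0, "float64", True)


def wrap_scalar_compromise(value, dtype, mask):
    """Return the singleton only when the result dtype matches it (assumed intent)."""
    if mask and dtype == MASKED.dtype:
        return MASKED
    # Otherwise keep the typed result and just record the mask.
    return Result(value, dtype, mask)


float_case = wrap_scalar_compromise(-5.5, "float64", mask=True)
bool_case = wrap_scalar_compromise(False, "bool", mask=True)

print(float_case is MASKED)  # True: existing behaviour kept for float results
print(bool_case is MASKED)   # False: the identity test no longer fires
print(bool_case.dtype)       # bool
```

The second print shows the backward-compatibility cost being debated: under this model, a masked bool or int scalar is no longer detected by an identity test against the singleton.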

charris · 2014-02-24T23:51:09Z

Labelling as a bug until the conversation concludes.

pierregm · 2014-02-25T00:19:08Z

Returning the masked singleton has been a design choice since numarray. I understand that it can be frustrating, that a better solution needs to be implemented (once a consensus is reached on missing/masked data...), and that testing x is np.ma.masked is a bit weird (on the other hand, testing x is None is quite common)... Nevertheless, I would first try setting np.ma.masked = MaskedArray(False, mask=True, dtype=bool) and see whether it breaks anything in numpy and matplotlib before using your solution (because there's a good chance that m.dtype will never be masked.dtype in the generic case).
Note #1: OK, I see the comments in L5673/L5764 about precedence, so there must have been a good reason at the time, but I can't remember which one.
Note #2: another possibility with masked scalars is to have a comparison return False all the time (like np.nan does). But that's a radical change too...
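
For reference, the NaN behaviour mentioned in Note #2 can be checked with plain Python floats, since np.nan is an ordinary float NaN. This sketch only illustrates NaN semantics; it does not show how a masked scalar would implement them.

```python
import math

nan = math.nan

# Every ordering and equality comparison involving NaN evaluates to False.
comparisons = {
    "nan > 0": nan > 0,
    "nan < 0": nan < 0,
    "nan >= 0": nan >= 0,
    "nan <= 0": nan <= 0,
    "nan == nan": nan == nan,
}

for expr, value in comparisons.items():
    print(expr, "->", value)  # each line ends with False

# The one exception is inequality, which is True for NaN.
print("nan != nan ->", nan != nan)  # True
```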

abalkin · 2014-02-25T03:53:35Z

> I would first try to set np.ma.masked=MaskedArray(False,mask=True, dtype=bool), see whether it breaks anything

Do you recall the reason for changing ma.masked dtype from int32 to float between oldnumeric.ma and numpy.ma?

abalkin added a commit to abalkin/numpy that referenced this issue Feb 20, 2014

BUG: Non-bool result when comparing ma scalars.

6d569ae

Removed logic replacing masked scalar results from ufuncs with ma.masked singleton which has dtype float64. Fixes numpy#4332.

abalkin mentioned this issue Feb 20, 2014

BUG: Non-bool result when comparing ma scalars. #4335

Closed

charris added Defect labels Feb 24, 2014

ahaldane mentioned this issue May 7, 2016

np.ma.masked is not a scalar #7588

Open

kabaka0 mentioned this issue Jul 3, 2017

FromScalar makes no sense with mask gorgonia/gorgonia#132

Closed

chewxy mentioned this issue Sep 17, 2017

FromScalar makes no sense with mask gorgonia/tensor#7

Open

mattip removed the priority: normal label Oct 21, 2018
