Should np.ma.masked be hashable? #4660

Open
hamogu opened this issue May 3, 2014 · 8 comments

@hamogu

hamogu commented May 3, 2014

It seems that in Python 2.7 I can do a = {np.ma.masked: 'AAA'}, while in Python 3 the same code throws TypeError: unhashable type: 'MaskedConstant' (NumPy 1.8 in both cases).

@jjhelmus
Contributor

np.ma.masked is an instance of the np.ma.MaskedArray class and is therefore not hashable in Python 3, for the same reason that instances of np.ma.MaskedArray, or even np.ndarray, are not hashable: their contents can change over time.

# Python 3.4, NumPy 1.9.3
import numpy as np
a = np.ndarray(1)
b = np.ma.MaskedArray()
hash(a)
# TypeError: unhashable type: 'numpy.ndarray'
hash(b)
# TypeError: unhashable type: 'MaskedArray'
hash(np.ma.masked)
# TypeError: unhashable type: 'MaskedConstant'

How can you change the contents of np.ma.masked? Easy, just modify the _data or _mask attributes. But please don't actually do this; bad things may result.

print(np.ma.masked._data)
# 0.0
np.ma.masked._data[()] = 1
print(np.ma.masked._data)
# 1.0
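
To illustrate why objects whose contents can change make poor dictionary keys, here is a minimal pure-Python sketch. The Box class is hypothetical and is not part of NumPy; it only stands in for any container whose hash would depend on its contents.

```python
class Box:
    """Hypothetical mutable container whose hash depends on its contents."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Box) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


box = Box(1)
table = {box: "AAA"}

# Mutating the key after insertion changes its hash, so the dict can no
# longer find the entry even though the same object is still stored in it.
box.value = 2
found_after_mutation = box in table
still_stored = any(key is box for key in table)

print(found_after_mutation, still_stored)
```

The entry becomes unreachable by normal lookup, which is the failure mode that disabling hashing on mutable containers is meant to prevent.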

@ahaldane
Member

The example a = {np.ma.masked: 'AAA'} shouldn't work in Python 2 either, right? It doesn't work on my machine with NumPy 1.10.

@jjhelmus
Contributor

@ahaldane Correct, a = {np.ma.masked: 'AAA'} should not work in either Python 2 or 3 since np.ma.masked is mutable.

From a git bisect, it looks like PR #5326 fixed the old broken behavior where np.ma.masked was hashable and could be used as a dictionary key. After this fix the example raises a TypeError, which I think is the correct behavior.

This behavior is not covered by any unit tests, so it may be prudent to add tests along the lines of:

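# collections.Hashable was removed in Python 3.10; use collections.abc instead.
import collections.abc

import numpy as np

def test_ma_collections_hashable():
    # Neither an ordinary masked array nor the masked constant should be hashable.
    x = np.ma.MaskedArray([])
    assert not isinstance(x, collections.abc.Hashable)

    y = np.ma.masked
    assert not isinstance(y, collections.abc.Hashable)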

Such tests might currently fail on Windows due to issue #5647.

@taldcroft

taldcroft commented Jun 4, 2017

Coming back to this because of astropy/astropy#6153 (comment), I still don't understand why np.ma.masked should not be hashable. It is an instance of MaskedConstant, which strongly implies that it is not mutable (even if it ultimately derives from MaskedArray). What is the point of naming something a constant if it can be changed?

Indeed, if you try to change it via public methods, an exception is raised. (I'm not concerned that there is a back-door private attribute; if somebody messes with np.ma.masked in that way, then they get what they deserve.)

In [7]: np.ma.masked[()] = 10
---------------------------------------------------------------------------
MaskError                                 Traceback (most recent call last)
<ipython-input-7-a4d285aabd96> in <module>()
----> 1 np.ma.masked[()] = 10

/Users/aldcroft/anaconda3/envs/astropy/lib/python3.5/site-packages/numpy/ma/core.py in __setitem__(self, indx, value)
   3227         """
   3228         if self is masked:
-> 3229             raise MaskError('Cannot alter the masked element.')
   3230         _data = self._data
   3231         _mask = self._mask

MaskError: Cannot alter the masked element.

@eric-wieser
Member

eric-wieser commented Sep 28, 2017

> Easy, just modify the _data or _mask attributes.

This no longer works.

> I still don't understand why np.ma.masked should not be hashable

I can see one argument: it represents an unknown value, and an unknown value has an unknown hash.

@eric-wieser eric-wieser added the component: numpy.ma masked arrays label Sep 28, 2017
@mhvk
Contributor

mhvk commented Sep 28, 2017

> I can see one argument: it represents an unknown value, and an unknown value has an unknown hash.

But np.nan can be used as a key. Since we often write <something> is masked, I think it is reasonable for it to be assumed constant and thus hashable.
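
For reference, a small sketch of the np.nan behavior mentioned above, alongside the identity-based "is masked" idiom. The variable names are only illustrative, and the snippet assumes a recent NumPy on Python 3.

```python
import numpy as np

lookup = {np.nan: "missing"}

# Works: dict lookup checks identity before equality, and np.nan is one object.
same_object_hit = lookup[np.nan]

# A different NaN object is not found, because NaN != NaN.
other_nan = float("nan")
other_nan_found = other_nan in lookup

# The "<something> is masked" idiom relies on identity, not equality or hashing.
arr = np.ma.array([1, 2, 3], mask=[False, True, False])
second_is_masked = arr[1] is np.ma.masked

print(same_object_hit, other_nan_found, second_is_masked)
```

So np.nan is usable as a key only in the sense that the specific np.nan object can be looked up again; a hashable np.ma.masked would presumably be used the same way, by identity.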

@endolith
Contributor

endolith commented Aug 3, 2019

Somewhat on-topic: Would it make sense for an array to become hashable when its flags.writeable is set to False, since it can no longer be modified? (My immediate goal is just to memoize functions that take NumPy arrays as inputs.)

@eric-wieser
Member

That's an interesting idea, but unfortunately it doesn't work too well: I can create a read-only view of a writeable array, and the hash would vary with the source array.
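
A minimal sketch of that situation, with illustrative variable names: the view rejects writes, yet its contents still change when the source array is modified.

```python
import numpy as np

source = np.arange(3)
view = source.view()
view.flags.writeable = False

# Writing through the read-only view is rejected...
try:
    view[0] = 99
    view_write_blocked = False
except ValueError:
    view_write_blocked = True

# ...but the view shares memory with the writeable source array,
# so its contents (and any content-based hash) can still change.
before = view.tobytes()
source[0] = 99
after = view.tobytes()
contents_changed = before != after

print(view_write_blocked, contents_changed)
```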
