Skip to content

ENH: Make no unshare mask future warnings less noisy #7187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

seberg
Copy link
Member

@seberg seberg commented Feb 4, 2016

Also fixes a small bug in flat setting (which is utterly broken
in any case though)

xmask = x._mask = y._mask = ymask = common_mask
x._sharedmask = False
y._sharedmask = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought I would remove it, unshare mask does nothing except setting _sharedmask to false, since the mask is overwritten anyway. Removing this means that the __setitem__ call to unshare_mask is the only remaining call in numpy. Coupld possibly be deprecated itself when this is done.

@seberg seberg changed the title ENH: Make future warnings less noisy ENH: Make no unshare mask future warnings less noisy Feb 4, 2016
# TODO: This is messed up, since there is no guarantee for
# ravel to return views into either array or mask.
# If self._sharedmask, we still do want self._mask to change here:
y._sharedmask = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I should leave this, the result changing based on sharedmask seems nonsense, but it does change behaviour once in a while. On the other hand we would have to futurewarn this on its own.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, removed this fix for now, to be "minimal". Als omaybe the warning gives at least some hint of the weird behaviour that goes on and it will be automatically fixed with the other stuff. (only the sharedmask bug, not the one about non-contiguous arrays).

@@ -3171,7 +3164,7 @@ def __getitem__(self, indx):
dout._mask = _mask[indx]
# set shape to match that of data; this is needed for matrices
dout._mask.shape = dout.shape
dout._sharedmask = True
dout._sharedmask = np.may_share_memory(_mask, dout._mask)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tihs line could in principle change a result, if you do something crazy like:

arr = masked_arr[[0, 1, 2, 3]]  # not a view
view = arr[::2]
arr[4] = 5.  # will change view with change, triggers copy without change
view2 = arr[::4]  # view after
arr[6] = 6.  # would change view2, since copy was triggered.

so maybe could undo so that a warning is given. The logic is designed to avoid back propagation to the parent, since that creates copies at an arbitrary later time, it can do do weird things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I guess I could undo this, in favor of doing that slightly silly refcount + owndata check below, which should find it. I think I will do that.

"ways.\n"
"To get the future behaviour and silence the warning, "
"set `ma_arr._sharedmask = False`.",
FutureWarning, stacklevel=2)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the private attribute is ugly, but the only way I can think of. Silencing the warning even more, seems only possible by starting to do reference count checks (i.e. refcount of mask == 1 and it has no base -> no copy necessary)

@seberg
Copy link
Member Author

seberg commented Feb 5, 2016

@njsmith you seem to have an overview who got hit hard by this, can you maybe ask them if this helps enough? It reduces the numpy tests failures to around 15 or so, it might clear up almost all things, or maybe they use masked views a lot and it is still extremely noisy....

@seberg
Copy link
Member Author

seberg commented Feb 5, 2016

I will fix the numpy "test failures" by setting that attribute or slencing them in my warnings cleanup PR.

@njsmith
Copy link
Member

njsmith commented Feb 5, 2016

@njsmith you seem to have an overview who got hit hard by this, can you maybe ask them if this helps enough?

Basically what I know is that I ran the statsmodels and matplotlib test suites against master and counted the warnings :-) Something like

virtualenv testenv
testenv/bin/pip install path/to/numpy statsmodels nose
testenv/bin/python -c "import statsmodels.api as sm; sm.test()" 2>&1 | tee statsmodels.log
git clone matplotlib/matplotlib
cd matplotlib
../testenv/bin/python setup.py build_ext -i
../testenv/bin/python tests.py 2>&1 | tee ../matplotlib.log
cd ..
grep "masked array" statsmodels.log | wc -l
grep "masked array" matplotlib.log | wc -l

@seberg
Copy link
Member Author

seberg commented Feb 5, 2016

That means I have to figure out how to run statsmodels nose ;). I made the changes more minimal by checking the refcounts now, I think this is about as good as it can in the sense of minimal invasive and as fully precise warning.

@njsmith
Copy link
Member

njsmith commented Feb 5, 2016

@seberg: That's why I just told you how to run statsmodels's tests :-P

@charris charris added this to the 1.11.0 release milestone Feb 5, 2016
@seberg
Copy link
Member Author

seberg commented Feb 5, 2016

I never let fact go into the way of my reasoning :p. Don't get any warnings with this applied, but maybe my quick hack test (or statsmodel version 0.5.0) were wrong/not sensitive. Saw only 2 with master? I had the impression I should see more.

@seberg
Copy link
Member Author

seberg commented Feb 5, 2016

OK, I had to add an "always" filter and now on master (I guess it might be different with python -c). Then I see the spam. Anyway, it is fully gone with this applied -> statsmodels is clean (which makes sense, the errors may even have come from inside numpy, which uses masked arrays sometimes for small nanfuncs).

Hmmm, some wheel issue with f2py cropped up in the test, seems unrelated.

@charris
Copy link
Member

charris commented Feb 5, 2016

Yeah, the latest venv release (5 in two weeks) with yet another setuptools upgrade is probably responsible. Grrr, we may just need to pin setuptools at some point. Or nuke it from orbit.

(sys.getrefcount(_mask) == 3) and _mask.flags.owndata):
# 2016.01.15 -- v1.11.0
warnings.warn(
"setting an item on a masked array which has a shared "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a big warning without line breaks, it would be better if it was shorter with maybe a reference to the release notes for extended explanation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do it (with release notes changes), but I am not the best writer for this stuff....

Also fixes a small bug in flat setting (which is utterly broken
in any case though)
@homu
Copy link
Contributor

homu commented Feb 9, 2016

☔ The latest upstream changes (presumably #7194) made this pull request unmergeable. Please resolve the merge conflicts.

@charris charris removed this from the 1.11.0 release milestone Feb 9, 2016
@charris
Copy link
Member

charris commented Feb 9, 2016

@seberg I went with #7194 for b3.

@seberg
Copy link
Member Author

seberg commented Feb 10, 2016

Sure, though it might be good to have a quick look here whether there is something that was missed before release.

@seberg
Copy link
Member Author

seberg commented Feb 21, 2016

@charris, Chuck, should I have a look at this today? I think at least the corrcoef fix still needs to be put in. Sorry if this is annoying you. It would be nice if someone who is not me can decide if some of the other differences should be put in. I don't want to assert my opinion here, but also feel it likely did not get the careful check it should have....

@njsmith
Copy link
Member

njsmith commented Feb 21, 2016

@seberg: I haven't had the time and attention to follow all these details properly, so I would certainly feel better knowing that you have checked it over :-)

@seberg
Copy link
Member Author

seberg commented Feb 28, 2016

Closing this for gh-7363, but feel free to close it if nobody cares anymore. This is my last bid at giving a shit, this is masked arrays after all. Giving the warning for something like np.nanmedian(np.full(10, np.nan).reshape(5, 2), axis=1) is pretty evil though ;).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants