ENH: Avoid memory peak and useless computations when printing a MaskedArray. #6748

saimn · 2015-11-30T08:41:14Z

Ref #3544. When printing a MaskedArray, the whole array is converted to the object dtype, even though only a few values are printed to the screen.
The approach here is to cut the array along each axis and keep only the subset that is used for the string conversion, so the output should not change. The size used for the cut (100 values along each axis) was chosen so that there are enough values when printing on a large screen (the number of printed values depends on the dtype, if I understand correctly). Maybe there is a better value to choose (inspired by what ndarray.str does?).
Maybe there is a better approach, in which case I am happy to improve this PR.
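
As a rough illustration of the cut described above, here is a minimal sketch. It is not the PR's actual code: `extract_corners` is a hypothetical helper, and the `print_width=100` default only mirrors the value mentioned in the description. It keeps the first and last `print_width // 2` entries along every axis that is longer than `print_width`.

```python
import numpy as np


def extract_corners(arr, print_width=100):
    """Keep only the leading and trailing entries along each long axis.

    Hypothetical helper for illustration; not part of NumPy.
    """
    half = print_width // 2
    for axis in range(arr.ndim):
        n = arr.shape[axis]
        if n > print_width:
            # Indices of the first `half` and the last `half` entries.
            keep = np.r_[0:half, n - half:n]
            arr = np.take(arr, keep, axis=axis)
    return arr


data = np.arange(3000).reshape(1000, 3)
corners = extract_corners(data)
print(corners.shape)  # (100, 3)
```

For a masked array, the same cut would presumably have to be applied to both the data and the mask before the conversion to the object dtype, so that the two stay aligned.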


charris · 2015-12-01T20:35:02Z

I'm curious about this comment:

`# convert to object array to make filled work`

Without looking through the whole file, I'm guessing this is to allow printing --- for masked values. If we could avoid the object conversion that would be a better solution. Probably more work though...

seberg · 2015-12-01T20:57:21Z

Holy, wow. That is the ugliest thing, but I think that might just solve gh-6723, or maybe it was always used like that?

seberg · 2015-12-01T21:15:50Z

Nvm (I know it probably did not make sense to anyone anyway ;)), but I got it the wrong way around: I thought the fill value was the problem, but it is the array...

saimn · 2015-12-01T21:25:02Z

> Without looking through the whole file, I'm guessing this is to allow printing --- for masked values. If we could avoid the object conversion that would be a better solution. Probably more work though...

Yes, I don't know of any other way to fill the masked values with -- short of doing it manually.
So it is still very ugly, but at least it will not eat all the memory (I was always surprised that it took so long to print big arrays).
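
To make the reason for the object conversion concrete, here is a small sketch using plain arrays rather than the `MaskedArray` internals. The variable names are illustrative only. A numeric array cannot hold the string placeholder, so the data is first converted to the object dtype, which is the step that becomes expensive for large arrays.

```python
import numpy as np

data = np.array([1, 2, 3])
mask = np.array([False, True, False])

# An integer array cannot store "--", so convert to the object dtype first.
as_object = data.astype(object)
# Replace the masked positions with the print placeholder.
as_object[mask] = "--"

print(as_object.tolist())  # [1, '--', 3]
```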

charris · 2015-12-01T22:04:04Z

There is a backward compatibility problem with getting rid of the singleton `masked_print_option`, but we could probably do it. I think it should be possible to dispense with it with a bit of cleverness. However, this looks like a good workaround for what we have now. It would be good to use a more descriptive name for `nval` and make it a class variable with an explanation of what it determines. That way, when someone complains that it is the wrong value, it will be easy to find and change. Could maybe call it `_print_width` or some such.
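
The class-variable suggestion can be sketched as follows. `TruncatingPrinter` and `WidePrinter` are hypothetical classes written for this illustration and are not part of NumPy; only the name `_print_width` comes from the discussion.

```python
class TruncatingPrinter:
    """Hypothetical class for illustration; not part of NumPy."""

    # Maximum number of entries kept along each axis before the
    # (expensive) conversion to strings. Change it here if the
    # default turns out to be the wrong value.
    _print_width = 100

    def trimmed_shape(self, shape):
        # Shape of the subset that would actually be converted.
        return tuple(min(n, self._print_width) for n in shape)


class WidePrinter(TruncatingPrinter):
    # A subclass (or a user) can override the limit in one place.
    _print_width = 200


print(TruncatingPrinter().trimmed_shape((1000, 5)))  # (100, 5)
print(WidePrinter().trimmed_shape((1000, 5)))        # (200, 5)
```

Keeping the limit as a documented class attribute means it can be found and adjusted without touching the printing logic itself.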

saimn · 2015-12-01T22:40:41Z

@charris: OK, done. Is a comment next to the class variable enough?

charris · 2015-12-01T22:53:21Z

numpy/ma/core.py

+                    # object dtype, extract the corners before the conversion.
+                    for axis in range(self.ndim):
+                        if data.shape[axis] > self._print_width:
+                            ind = np.int(self._print_width / 2)


Do `ind = self._print_width // 2` to avoid the integer cast.

Note that we always use the Python 3 meaning of / in order to be able to use the same source code for both Python 2 and 3.
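
A short sketch of the difference, assuming Python 3 division semantics (the variable names are illustrative only):

```python
print_width = 100

float_half = print_width / 2   # true division always yields a float
int_half = print_width // 2    # floor division of two ints yields an int

print(type(float_half).__name__, float_half)  # float 50.0
print(type(int_half).__name__, int_half)      # int 50

# An int can be used directly as a slice bound, with no cast needed.
values = list(range(print_width))
print(values[:int_half][-1])  # 49
```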

charris · 2015-12-01T22:55:20Z

Comment looks fine, but there is another nit to pick.

saimn · 2015-12-01T23:02:58Z

Good point, thanks !

ENH: Avoid memory peak and useless computations when printing a MaskedArray.

charris · 2015-12-01T23:18:53Z

Thanks Simon.

DOC: Add changelog for #6734 and #6748.

ahaldane · 2015-12-02T17:01:40Z

This PR is a great idea! (at least for now)

Possibly for a "real" fix (to avoid the conversion to the object dtype) we need a new dtype system where we can override the __repr__ function of the dtype. Currently this is only possible for the np.void dtype.


Ref numpy#7621. numpy#6748 added `np.ma.MaskedArray._print_width` which is used to cut a masked array before printing it (to save memory and cpu time during the conversion to the object dtype). But this doesn't work correctly for 1D arrays, for which up to 1000 values can be printed before cutting the array. So this commit adds a new class variable `_print_width_1d` to handle the 1D case separately.
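
The 1D special case can be sketched like this. `effective_print_width` is a hypothetical helper, and the default of 1500 for `print_width_1d` is an illustrative assumption chosen only because it exceeds the 1000 values mentioned above. It is not a value confirmed by this thread.

```python
def effective_print_width(ndim, print_width=100, print_width_1d=1500):
    """Pick the cut size depending on dimensionality.

    Hypothetical helper; the defaults are assumptions for illustration.
    """
    # A 1D array can show many more values in full than each axis of an
    # N-D array, so it needs a larger limit before being cut.
    return print_width_1d if ndim == 1 else print_width


print(effective_print_width(1))  # 1500
print(effective_print_width(2))  # 100
```

With a single shared limit of 100, a 1D array of, say, 500 elements would be cut to 100 values before printing even though all 500 would otherwise be shown, which is the incorrect output the commit message describes.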

charris added 00 - Bug component: numpy.ma masked arrays labels Nov 30, 2015

Allow to change the maximum width with a class variable.

b5c456e

charris reviewed Dec 1, 2015

Use integer division to avoid casting to int.

d0e9d98

charris added a commit that referenced this pull request Dec 1, 2015

Merge pull request #6748 from saimn/ma-repr-memory

11f8092

ENH: Avoid memory peak and useless computations when printing a MaskedArray.

charris merged commit 11f8092 into numpy:master Dec 1, 2015

saimn deleted the ma-repr-memory branch December 2, 2015 08:40

saimn added a commit to saimn/numpy that referenced this pull request Dec 2, 2015

DOC: Add changelog for numpy#6734 and numpy#6748.

f752d84

charris added a commit that referenced this pull request Dec 2, 2015

Merge pull request #6754 from saimn/ma-changelog

45ff556

DOC: Add changelog for #6734 and #6748.

saimn mentioned this pull request Feb 4, 2016

BUG: str and repr special methods blow up memory usage #3544

Closed

jaimefrio pushed a commit to jaimefrio/numpy that referenced this pull request Mar 22, 2016

DOC: Add changelog for numpy#6734 and numpy#6748.

071b6de

charris mentioned this pull request May 11, 2016

BUG: error in printing masked arrays #7621

Closed

saimn mentioned this pull request May 22, 2016

BUG: fix incorrect printing of 1D masked arrays #7658

Merged

charris mentioned this pull request May 23, 2016

Backport 7658, BUG: fix incorrect printing of 1D masked arrays #7665

Merged
