Skip to content

Add where= parameter to ufuncs #99

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 24 commits into from
Closed

Conversation

mwiebe
Copy link
Member

@mwiebe mwiebe commented Jun 29, 2011

Also made a standard mechanism for inner loop auxiliary data, converting the existing low level casting loops to use it. I've also moved the ufunc type promotion functions into their own file to tame the ufunc_object.c monster a tiny bit.

A simple test, mainly showing how poorly boolean indexing performs:

In [24]: a = np.ones((1000,1000))
In [25]: b = np.ones((1000,1000))
In [26]: c = np.zeros((1000, 1000))
In [27]: m = np.random.rand(1000,1000) > 0.5

In [28]: timeit c[m] = a[m] + b[m]
1 loops, best of 3: 246 ms per loop

In [29]: timeit np.add(a, b, out=c, where=m)
10 loops, best of 3: 20.3 ms per loop

In [30]: timeit np.add(a, b, out=c)
100 loops, best of 3: 12.4 ms per loop

@dhomeier
Copy link
Contributor

Impressive speedup indeed, also getting 10-12x improvement for all but transcendental operations (np.power(a, b)...).
I did notice an instability not directly related (found at last it also is present in 1.6's np.add): If you pass an output array with a inconsistent dtype, like


> > > b = np.ones((1000,1000), dtype=np.int64)
> > > c = np.zeros((1000,1000))
> > > np.add(a, b, out=c)
> > > Bus error `
> > > or alternatively
> > > `>>> a = np.ones((1000,1000), dtype=np.int32)
> > > b = np.ones((1000,1000), dtype=np.int32)
> > > c = np.zeros((1000,1000))
> > > np.add(a, b, out=c)
> > > python2.7(96832) malloc: **\* error for object 0x100ae4008: incorrect checksum for freed object - object was probably modified after being freed.
> > > **\* set a breakpoint in malloc_error_break to debug
> > > Abort trap```
> > > I don't know if it would be possible to catch such incorrect input, but maybe you could have a look at it while you are working at this...

@mwiebe
Copy link
Member Author

mwiebe commented Jun 29, 2011

Thanks. The crashes you're getting definitely shouldn't be happening (NumPy should never crash!), it should automatically convert the data (raising an exception if it's violating the casting rule).

@dhomeier
Copy link
Contributor

Since this is the missing data branch, I am also reporting some Python3-errors with tests updated from your previous masked array commits:

.
Traceback (most recent call last):
  File "/Users/derek/lib/python3.2/site-packages/numpy/ma/tests/test_core.py", line 1703, in test_inplace_division_scalar_int
    x /= 2
  File "/Users/derek/lib/python3.2/site-packages/numpy/ma/core.py", line 3755, in **itruediv**
    ndarray.**itruediv**(self._data, np.where(self._mask, 1, other_data))
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''
.
ERROR: Test of inplace operations and rich comparisons
.
Traceback (most recent call last):
  File "/Users/derek/lib/python3.2/site-packages/numpy/ma/tests/test_old_ma.py", line 489, in test_testInplace
    x /= 2
  File "/Users/derek/lib/python3.2/site-packages/numpy/ma/core.py", line 3755, in __itruediv__
    ndarray.**itruediv**(self._data, np.where(self._mask, 1, other_data))
TypeError: ufunc 'true_divide' output (typecode 'd') could not be coerced to provided output parameter (typecode 'l') according to the casting rule ''same_kind''

  • all due to usingx / n for integer division. I'd attach the patch here if it was possible (just 4 changes to xm //= 2 in the tests) - let me know if I should put it on Trac instead.

@mwiebe
Copy link
Member Author

mwiebe commented Jun 29, 2011

I've patched those integer divisions.

@charris
Copy link
Member

charris commented Jun 30, 2011

Bit more on the first segfault (64 bit linux)

#0  LONG_add (args=<optimized out>, dimensions=<optimized out>, 
    steps=<optimized out>, __NPY_UNUSED_TAGGEDfunc=<optimized out>)
    at numpy/core/src/umath/loops.c.src:724
#1  0x00007ffff09c6360 in iterator_loop (self=<optimized out>, 
    op=<optimized out>, dtype=<optimized out>, order=<optimized out>, 
    buffersize=<optimized out>, arr_prep=<optimized out>, arr_prep_args=0x0, 
    innerloop=0x7ffff09d6000 <LONG_add>, innerloopdata=0x0)
    at numpy/core/src/umath/ufunc_object.c:1273
#2  0x00007ffff09dd7f2 in execute_ufunc_loop (innerloopdata=0x0, innerloop=
    0x7ffff09d6000 <LONG_add>, arr_prep_args=0x0, arr_prep=0x7fffffffcd70, 
    buffersize=8192, order=NPY_KEEPORDER, dtype=0x7fffffffcc70, op=
    0x7fffffffd410, trivial_loop_ok=0, self=0x8adf80)
    at numpy/core/src/umath/ufunc_object.c:1412
#3  PyUFunc_GenericFunction (self=0x8adf80, args=
    (<numpy.ndarray at remote 0xe26c40>, <numpy.ndarray at remote 0xe9a0c0>), 
    kwds={'out': <numpy.ndarray at remote 0xe9c820>}, op=0x7fffffffd410)
    at numpy/core/src/umath/ufunc_object.c:2228
#4  0x00007ffff09de556 in ufunc_generic_call (self=0x8adf80, args=
    (<numpy.ndarray at remote 0xe26c40>, <numpy.ndarray at remote 0xe9a0c0>), 
    kwds={'out': <numpy.ndarray at remote 0xe9c820>})
    at numpy/core/src/umath/ufunc_object.c:3720
#5  0x0000003e42c48fe3 in PyObject_Call (func=
    <numpy.ufunc at remote 0x8adf80>, arg=<optimized out>, kw=<optimized out>)

@mwiebe
Copy link
Member Author

mwiebe commented Jun 30, 2011

Great, I think it would be good to put the crash info in its own trac ticket, since it's independent of this pull request. I can probably get to it some time after this is merged.

@mwiebe
Copy link
Member Author

mwiebe commented Jun 30, 2011

I've added some commits to deprecate some bad namespace pollution in the macros. I've started with some obviously bad C API stuff.

@@ -291,8 +352,8 @@ New functions added to the ndarray are::
array is unmasked and has the 'NA' part stripped from the
parameterized type ('NA[f8]' becomes just 'f8').

arr.view(masked=True)
This is a shortcut for 'a = arr.view(); a.flags.hasmask=True'.
arr.view(namasked=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking 'asnamasked' would be more descriptive, but it's getting a bit long.

@charris
Copy link
Member

charris commented Jul 1, 2011

Where might be more useful if it didn't overwrite the masked portions of the output.

In [43]: a = ones(10, int64)

In [44]: b = ones(10, int64)

In [45]: c = ones(10, float64)

In [46]: add(a,b,out=c, where=a>b)
Out[46]: 
array([  2.66916676e+11,   2.66916676e+11,   1.39708543e+14,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00])

Hmm, that seems to be part of the mixed type problem.

In [52]: c = ones(10, int64)

In [53]: add(a,b,out=c, where=a>b)
Out[53]: array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])

@mwiebe
Copy link
Member Author

mwiebe commented Jul 1, 2011

I think it would be good to document the API deprecations before merging, but otherwise hopefully it's good to go? Thanks for the reviews!

@charris
Copy link
Member

charris commented Jul 1, 2011

I'm concerned about committing when there is a known segfault, otherwise I would have put it in last night. The problem looks to be in the source pointer for the copy to the output. It works fine without the where. Thoughts?

@mwiebe
Copy link
Member Author

mwiebe commented Jul 1, 2011

I haven't tried it, but I read Derek's comment to mean that this bug exists all the way back to 1.6, so this patch wasn't changing anything about it. I suppose I could just track it down first, sure.

@charris
Copy link
Member

charris commented Jul 1, 2011

Hmm, you're right. There may actually be two errors, or maybe some combination of the input being freed at different times. I was using small arrays for testing in the hope of avoiding the segfault, but I do see it with large arrays. However, the output is wrong in the small array (no segfault) case with mixed kinds and the where keyword.

@charris
Copy link
Member

charris commented Jul 2, 2011

Here is what I am talking about

In [14]: a = zeros(100, dtype=int64)

In [15]: c = zeros(100, dtype=float64)

In [16]: add(a, a, out=c, where=(arange(100)<5))
Out[16]: 
array([  0.,   0.,   0.,   0.,   0.,   5.,   6.,   7.,   8.,   9.,  10.,
    11.,  12.,  13.,  14.,  15.,  16.,  17.,  18.,  19.,  20.,  21.,
    22.,  23.,  24.,  25.,  26.,  27.,  28.,  29.,  30.,  31.,  32.,
    33.,  34.,  35.,  36.,  37.,  38.,  39.,  40.,  41.,  42.,  43.,
    44.,  45.,  46.,  47.,  48.,  49.,  50.,  51.,  52.,  53.,  54.,
    55.,  56.,  57.,  58.,  59.,  60.,  61.,  62.,  63.,  64.,  65.,
    66.,  67.,  68.,  69.,  70.,  71.,  72.,  73.,  74.,  75.,  76.,
    77.,  78.,  79.,  80.,  81.,  82.,  83.,  84.,  85.,  86.,  87.,
    88.,  89.,  90.,  91.,  92.,  93.,  94.,  95.,  96.,  97.,  98.,
    99.])

This behavior is actually different if the arrays are smaller. And if the output is int64

In [28]: a = zeros(100, dtype=int64)

In [29]: c = zeros(100, dtype=int64)

In [30]: add(a, a, out=c, where=(arange(100)<5))
Out[30]: 
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 0, 0, 0, 0])

@charris
Copy link
Member

charris commented Jul 2, 2011

Playing with the bug{s}, there appears to be no problem with unary ufuncs.

proposed elsewhere for customizing subclass ufunc behavior with a
_numpy_ufunc_ member function would allow a subclass with a different
default to be created.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this would be the approach where numpy.ma can be implemented to maintain backwards compatibility?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That should mostly be possible, yes. For the subtleties and features that the missing data abstraction doesn't provide, I'm also trying to make working with masks easier in general with things like the 'where=' parameter in ufuncs.

@WeatherGod
Copy link
Contributor

I just tried a clean build of this pull request on my 32-bit, python 2.7 platform. Somewhere during the tests, a glibc exception occurs with respect to a double-freeing of memory.

@mwiebe
Copy link
Member Author

mwiebe commented Jul 5, 2011

Sorry about the test error, I added tests for bugs that Derek and Chuck reported but haven't implemented fixes for them yet.

@charris
Copy link
Member

charris commented Jul 6, 2011

Hi Mark,

This is getting rather large. Would you mind if I committed the NEP and header cleanups separately and then you could rebase the remainder on top of that?

@mwiebe
Copy link
Member Author

mwiebe commented Jul 6, 2011

Sure, go for it!

@charris
Copy link
Member

charris commented Jul 6, 2011

Never mind, it isn't that easy to tease things apart ;)

@charris
Copy link
Member

charris commented Jul 6, 2011

Left out a cherry-pick, it works now. I'll put it up as a pull request.

Mark Wiebe added 18 commits July 6, 2011 17:06
…ends

This is an attempt to reduce potential confusion between the existing
numpy.ma and the NA-supporting built-in mask.
This one handles PyArray_DEFAULT -> NPY_DEFAULT_TYPE and
the NPY_* array flags -> NPY_ARRAY_*. The PyArray_DEFAULT vs NPY_DEFAULT
confusion was particularly egregious here.
… masks'

This includes an email comment from Ben about 'np.any' and 'np.all'.
This is a step towards having everyone on the list use the same
vocabulary with specific nailed-down definitions for the terms.
Will reenable it once masking iteration features are done.
@mwiebe
Copy link
Member Author

mwiebe commented Jul 6, 2011

The bug Derek found is fixed in master now. The bug Chuck found requires implementing mask features in the iterator to properly handle it, so I'd like to merge this as is with the bug still there. I've marked it as known fail, and will unmark it when it can be fixed.

@charris
Copy link
Member

charris commented Jul 6, 2011

Sounds good, I put it in.

@charris charris closed this Jul 6, 2011
@dhomeier
Copy link
Contributor

dhomeier commented Jul 7, 2011

Thanks for working on this, numpy no longer crashes, and all combinations with different in and out dtypes also seem to work for the standard function call (or raise the appropriate TypeError)! However, with your where parameter np.add() is now returning wrong results [Python2.7, Mac OS 10.6]:


> > > b = np.ones((1000,1000), dtype=np.int64)
> > > c = np.zeros((1000,1000), dtype=np.float64)
> > > d = np.add(a, b, out=c, where=m)
> > > a.max(), b.max(), c.max(), d.max()
> > > (1, 1, 8.0, 8.0)
> > > c = np.zeros((1000,1000), dtype=np.float32)
> > > d = np.add(a, b, out=c, where=m)
> > > a.max(), b.max(), c.max(), d.max()
> > > (1, 1, 4.6071824e+18, 4.6071824e+18)```

@mwiebe
Copy link
Member Author

mwiebe commented Jul 7, 2011

Cool, thanks for confirming all that. I'm presently extending the iterator to have a masked mode which is needed to properly fix that bug. There's a test in the suite which is marked knownfail at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants