Skip to content

ENH: Added axis param for np.count_nonzero #7138

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from
Closed

ENH: Added axis param for np.count_nonzero #7138

wants to merge 0 commits into from

Conversation

gfyoung
Copy link
Contributor

@gfyoung gfyoung commented Jan 29, 2016

Addresses feature request in #391 to add an axis parameter to np.count_nonzero.

@jaimefrio (from the discussion, it seems you had done some work on it already?)

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 29, 2016

Below is some timing information

On master:

> python -m timeit -n 100 "import numpy as np; m=np.random.randint(-1, 2, size=(10, 10, 10, 10, 10, 10));np.count_nonzero(m)"
100 loops, best of 3: 32.9 msec per loop

> python -m timeit -n 100 "import numpy as np; m=np.random.randint(-1, 2, size=(10, 10, 10, 10, 10, 10));(m != 0).sum(axis=None)"
100 loops, best of 3: 26 msec per loop

> python -m timeit -n 100 "import numpy as np; m=np.random.randint(-1, 2, size=(10, 10, 10, 10, 10, 10));(m != 0).sum(axis=5)"
100 loops, best of 3: 29.4 msec per loop

On PR:

> python -m timeit -n 100 "import numpy as np; m=np.random.randint(-1, 2, size=(10, 10, 10, 10, 10, 10));np.count_nonzero(m)"
100 loops, best of 3: 18.6 msec per loop

> python -m timeit -n 100 "import numpy as np; m=np.random.randint(-1, 2, size=(10, 10, 10, 10, 10, 10));(m != 0).sum(axis=5)"
100 loops, best of 3: 21.2 msec per loop

As can be observed, these changes cause a noticeable speed-up in the cases where dtype allows for good behavior along an axis.

array([2, 3])

"""
if not isinstance(a, ndarray):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as a = np.asanyarray(a), or a = np.asarray(a) if you do not want to let subclasses through.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which part? I'm not sure I follow with what you boxed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.asanyarray checks if its argument is an array (or a subclass), and returns it unchanged if it is. If it is not, it tries to convert its input to an array. So instead of doing the check yourself and then calling np.array, you can unconditionally call a = np.asanyarray(a).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that is true, and it is more compact. Done.

@juliantaylor
Copy link
Contributor

Unfortunately not really what we want.
The reason we want the axis keyword on count_nonzero is that it a lot faster on boolean array than sum.
This probably implies converting the C code to use the NDIter and figuring out a way to avoid the huge startup overhead of that structure (as count_nonzero is often used as a fast alternative to all on small arrays)

An alternative would be to figure out a way to use the count_nonzero code in the add.reduce ufunc (+ overhead avoiding)

As those two options currently look relatively far away maybe as a middle ground the code could call into the flattened version when possible and use sum like here when not.

@juliantaylor
Copy link
Contributor

also please try to keep pep8 changes and new code in separate commits (git add -p helps here)

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 29, 2016

@juliantaylor : Yeah...I had a lot of issues incorporating the axis parameter into the C code, which is why I went with the solution I went with. I agree that a more C-like solution should be found, but at least this PR gets the axis parameter into the code-base. How should I proceed exactly in your opinion?

@juliantaylor
Copy link
Contributor

probably adding something along the lines of this could work:

if axis is None and a.dtype == bool:
    return multiarray.core.count_nonzero(a.ravel())

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 29, 2016

How about this?

if a.dtype.name == 'bool':
    if axis is None:
        # flattening should be done for us
        return multiarray.count_nonzero(a)

    return a.sum(axis=axis)


return a.sum(axis=axis)

nonzeros = (a != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would not work for object arrays. You need to use bool(...) logic there. Though I admit that the current inner loop func is bugged, since it should be checking for errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I quite follow your comment here. Could you elaborate?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"asdf" != 0, but bool("asdf") evaluates True of course, and so it is counted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"asdf" != 0 also evaluates True, right? I'm not sure I follow the point your explanation is trying to make.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, starting to be unconcentrated, but then "" != 0 as well, and bool("") evaluates False, same thing. Also you are not limited to strings, does not make it better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, okay, I see what you mean. Empty strings, arrays, etc. should be counted as zero's. So it really isn't counting zero's then? It's more counting how many False's we get when iterating over "elements" of the array?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is counting "nonzero", which for objects maps to obj.__nonzero__(), which is the mechanism behind bool(obj). bool(np.array([1, 2])) raises an error (unlike lists), and this error seems unhandled, and should probably raise even here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At line 432, you mean? If that's the case, I think backwards compatibility gets broken then. Perhaps maybe I should divide logic between whether axis is provided or not like this:

if axis is None:
...optimize if possible but call multiarray.count_nonzero if possible
else:
...messy logic here

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 30, 2016

Rewrote the function so that we don't have to use element-wise comparisons.

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 30, 2016

This is just a "restart" if you may. I will also attempt to optimize based on dtype since element-wise comparisons work well for numbers for example.

elif issubdtype(a.dtype, str):
a = (a != '')

if a.dtype.name == 'bool':
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use

a.dtype == bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing. Not sure why I keep NOT changing that.

@juliantaylor
Copy link
Contributor

don't know if it is very useful, but as this puts a ufunc like interface on top of count_nonzero it should also support the out and keepdims arguments, for the latter there is a helper function ´_ureduce, see e.g.median`

an issue is that it then has a ufunc like interface but without the dtype argument which messes up the ordering. Maybe making them keyword only would help?
Then there are the more advanced ufunc arguments like where= though I think those might already be keyword only so they could be skipped for now.

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 30, 2016

@juliantaylor : The issue is that the interface is only applied in certain cases, not to mention it adds additional complexity to a function that has enough issues as it is. I'm not sure I want to tackle that in this PR given all those considerations.

@juliantaylor
Copy link
Contributor

I guess you know see why it didn't exist up to now, adding new functions needs a lot of consideration, there is only one chance at getting it right, after release it cannot be changed anymore.

but I accept it might be a bit much for this PR and can be added later, but a ticket with a milestone needs to be created.

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 30, 2016

I do concede that changing an API after release is a lot more difficult than beforehand, though it is something I'm not all too unfamiliar with given my other PR's. 😄

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 30, 2016

Rewrote some of the important optimizations used in np.count_nonzero as well as removed some unnecessary array creation overhead in arr_count_nonzero since we would now always pass it an ndarray object with the new implementation. Travis and Appveyor seem to be happy with this as well.

# multiarray is faster with
# booleans when axis is None
if a.dtype == bool:
return multiarray.count_nonzero(a) if axis is None \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Backslashes are not very PEP8 compliant, use parentheses:

return (multiarray.count_nonzero(a) is axis is None
        else a.sum(axis=axis, dtype=intp))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@jaimefrio
Copy link
Member

Is the increased memory use of always creating a boolean array before counting a concern? Because if it is not it should be relatively simple to rewrite all of this, including multiple axes, in C without needing to use the iterator, just relying on operating on a newly created contiguous array.

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 31, 2016

I am not aware of any such issues ATM. I do agree it certainly is preferable to write the code in C (more consistent with other multiarray API methods + faster). However, as I had issues doing this, your guidance on such a rewrite would definitely be appreciated. I know you had said in the discussion in #391 that you had gone on some sort of axis-adding spree that included count_nonzero, but I failed to find the branch with the code. Perhaps that could be a start?

@jaimefrio
Copy link
Member

There's a stalled PR, #4330, that has a working implementation of an axis argument for np.bincount, with all the bells and whistles, including a fast path to avoid using the iterator when not strictly necessary. It is a lot of code, and figuring a way of doing these type of things without bloating the code base so much is something we haven't figured out yet. But if you have trouble sleeping, you can take a look!

@gfyoung
Copy link
Contributor Author

gfyoung commented Jan 31, 2016

Alrighty! Sounds good. I'll have a look, and if I can't make much sense of it, I'll let you know. IMHO, this PR is probably ready to merge in its current form because it adds the axis parameter and already optimizes for several of what should be the most common usages.

@gfyoung gfyoung closed this Feb 3, 2016
@gfyoung gfyoung deleted the count_nonzero_axis branch February 3, 2016 04:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants