
MAINT: Speedup field access by removing unneeded safety checks #6208


Merged (4 commits, Oct 18, 2015)

Conversation

@ahaldane (Member)

#5548 causes a big slowdown in the specific case of accessing/assigning fields of structured arrays with large (e.g. 1 KB) dtypes. Oops! I noticed this while investigating #1984. The problem is that #5548 implemented an algorithm which checks view safety byte-by-byte (slow), under the assumption that most dtypes only span a small number of bytes. That's bad if the dtype is very large, as in the test script of #1984.

One fix would be to speed up the safety-check algorithm, but it also turns out that the checks aren't needed in many cases. For example, during field indexing they aren't needed because we're merely viewing fields that already exist, which is safe. So I've bypassed the safety checks by rewriting that indexing code (in C) so it avoids using PyArray_View.
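The "viewing fields that already exist" point can be illustrated from Python (a minimal sketch of the user-visible behavior; the rewritten indexing code itself is in C):

```python
import numpy as np

# Field indexing returns a view onto bytes that already exist in the
# parent array, so no byte-by-byte view-safety check is required.
a = np.zeros(3, dtype=[('a', 'i4'), ('b', 'f8')])
fa = a['a']                  # field access
fa[:] = 7                    # writes through to the parent array
assert a['a'][2] == 7
assert np.shares_memory(fa, a)
```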

I also made the corresponding voidtype methods call the ndarray methods to get the same benefits. Incidentally, this also enables some functionality that was missing from voidtype relative to ndarray. E.g., you can now do arr[0][['a', 'b']], and it will give you a more correct error message.
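Structured scalars behave like views into their parent array, so routing their indexing through the ndarray machinery is a natural fit; a small sketch of the user-visible behavior (the field names here are arbitrary):

```python
import numpy as np

# A numpy.void scalar obtained by indexing acts as a view into the array.
arr = np.zeros(2, dtype=[('a', 'i8'), ('b', 'f8')])
rec = arr[0]                 # a numpy.void scalar
rec['a'] = 5                 # assignment through the scalar...
assert arr['a'][0] == 5      # ...writes into the parent array
sub = rec[['a', 'b']]        # multi-field indexing on a scalar
```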

Safety checks are still important for get/setfield and for views. But even there we can skip the checks early if we know that the datatypes aren't structured or aren't of object type.
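get/setfield reinterpret raw bytes at an offset, which is why they still need the checks; a sketch of the user-visible behavior (the early-exit logic itself lives in C):

```python
import numpy as np

# getfield/setfield reinterpret the bytes at a given offset, so the
# view-safety checks still apply here; they can only be skipped early
# when neither dtype is structured and neither contains objects.
a = np.zeros(3, dtype=[('x', 'i4'), ('y', 'f8')])
xs = a.getfield(np.int32, offset=0)   # view onto the 'x' bytes
xs[:] = 42
assert a['x'][0] == 42                # getfield returned a view, not a copy
a.setfield(99, np.int32, offset=0)    # setfield writes the same bytes
assert a['x'][0] == 99
```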

Timing info for the test script in #1984, run as ./test.py 10000

Numpy 1.9.2:

array,  B['a'][:] = A['a']      0.00852 seconds
array,  B['a']    = A['a']      0.00648 seconds
scalar, B['a'][:] = A['a']      0.016 seconds
scalar, B['a']    = A['a']      0.132 seconds

Master:

array,  B['a'][:] = A['a']      16.4 seconds
array,  B['a']    = A['a']      16.3 seconds
scalar, B['a'][:] = A['a']      16.1 seconds
scalar, B['a']    = A['a']      16.6 seconds

This PR:

array,  B['a'][:] = A['a']      0.00733 seconds
array,  B['a']    = A['a']      0.00673 seconds
scalar, B['a'][:] = A['a']      0.0171 seconds
scalar, B['a']    = A['a']      0.0151 seconds
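The original test script from #1984 isn't reproduced here; a rough stand-in showing the shape of the measurement (the ~1 KB dtype and the array sizes are assumptions, not the script's actual values):

```python
import timeit
import numpy as np

# A large (~1 KB) structured dtype: the case where the byte-by-byte
# safety check was slow, since it ran once per field access.
big = np.dtype([('a', 'f8'), ('pad', 'V1016')])   # 8 + 1016 = 1024 bytes
A = np.zeros(1000, dtype=big)
B = np.zeros(1000, dtype=big)

def assign():
    B['a'] = A['a']           # each call takes a view with a 1 KB dtype

t = timeit.timeit(assign, number=100)
print('B["a"] = A["a"], 100 runs: %.4g s' % t)
```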

Later I'll look into a faster algorithm for the structured safety checks.

I think this PR also fixes #1984, if 0.0151s is fast enough!

By the way, I think get/setfield are no longer used anywhere internally in numpy.

@ahaldane ahaldane force-pushed the fast_field_subscript branch 21 times, most recently from 4573b88 to 290213c Compare August 15, 2015 01:08
@seberg (Member) commented Aug 19, 2015

Do we have to consider this a 1.10 regression?

@ahaldane (Member, Author)

I don't yet know enough about the dev process to answer that, but note it's probably a pretty rare problem:

The slowdown should only be noticeable if you are taking thousands of views of an array with a huge dtype. I expect that the number of people using huge dtypes (e.g., large subarrays) is pretty small, and the number of people also taking thousands of views with such a dtype is even smaller.

@seberg (Member) commented Aug 19, 2015

OK, good, then I assume it does not have priority, just wondered.

@ahaldane (Member, Author)

I'll write up a short description of the code changes in a little bit, to help any reviewers.

@seberg (Member) commented Oct 13, 2015

Not sure you have to worry about it too much. Back in the day I wanted to know if we needed to make sure it made it into 1.10; I thought it did not sound necessary (which apparently was not quite right), and after that I never looked at it. If you think it makes reviewing much easier, go ahead I guess.

@pv (Member) commented Oct 13, 2015

Maybe add a benchmark for it? pv@1dff685
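pv's link points at an airspeed velocity (asv) benchmark; a sketch of what such a benchmark class can look like (the class and method names here are hypothetical, not the actual committed benchmark):

```python
import numpy as np

class LargeDtypeFieldAccess:
    """asv-style benchmark: methods prefixed with time_ get timed."""

    def setup(self):
        dt = np.dtype([('a', 'f8'), ('pad', 'V1016')])  # ~1 KB items
        self.A = np.zeros(1000, dtype=dt)
        self.B = np.zeros(1000, dtype=dt)

    def time_field_index(self):
        self.A['a']

    def time_field_assign(self):
        self.B['a'] = self.A['a']
```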

@ahaldane ahaldane force-pushed the fast_field_subscript branch from 290213c to 1dff685 Compare October 13, 2015 18:04
@ahaldane (Member, Author)

Great, I tried adding your commit. Hopefully that is the right way to do it.

@charris charris added this to the 1.10.2 release milestone Oct 13, 2015
""" Given a structured array and a sequence of field names
construct new array with just those fields.
def _copy_fields(ary):
"""Returns a copy of a structured array with padding/unknown bytes between
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the summary lines < 80 characters. Maybe

    """Return copy of structured array with unlabeled bytes between fields removed.
    """

Or some such.


Or maybe even

"""Return copy of structured array with padding between fields removed.
"""

@charris (Member) commented Oct 13, 2015

> Do we have to consider this a 1.10 regression?

Looks like it ;)

This is a pretty extensive refactoring, so it makes me a bit nervous. Someone needs to do a deep review of the code, which I haven't managed yet. Did you ever profile the original to see just where the bottlenecks were?

@ahaldane (Member, Author)

I don't think profiling is needed: I know the view safety checks are slow for large datatypes, and I know they are used when indexing fields. So the solution is to (safely) skip the checks here.

@ahaldane (Member, Author)

I am also not against reverting #5548. I knew that the algorithm there was slow, but didn't expect it to be such a problem. Given the extent of the changes needed here, I'm thinking #5548 needs to be given more thought - eg maybe the alternate approach I discussed but decided against would avoid this slowdown.

I just tried reverting it on my local setup, there does not seem to be any problem in doing so. (I just needed one extra small change to account for extra safety features I introduced elsewhere.) Reverting is the easy fix to these speed issues, and I could reintroduce a better version of #5548 for 1.11 or later. Any opinions?

@charris (Member) commented Oct 18, 2015

@ahaldane If you fix the nit I'll put this in so it can get some testing.

@charris (Member) commented Oct 18, 2015

Reversion would be the easy fix. I think it comes down to whether or not you would like to take the time to rethink the issue and put up a clean-sheet solution, or continue to pursue the current fixes.

@ahaldane ahaldane force-pushed the fast_field_subscript branch from 1dff685 to 16a2ab1 Compare October 18, 2015 17:42
ahaldane and others added 4 commits October 18, 2015 15:15
- Bypass unneeded "view" safety checks in `array_subscript` and `array_assign_subscript`, by avoiding use of `PyArray_View`.
- Bypass unneeded "view" safety checks in the `voidtype_` subscript/assignment methods, by falling back to the ndarray methods, which skip the checks.
- Skip safety checks in views as long as neither the old nor the new dtype of the view may contain objects.
@ahaldane ahaldane force-pushed the fast_field_subscript branch from 16a2ab1 to 8cf5b50 Compare October 18, 2015 19:21
@ahaldane (Member, Author)

Looking over it all again (and adding a minor tweak), I think merging this is better than reverting #5548. #5548 is probably the right solution, at least until a new dtype system comes out. If not, it's pretty easy to back out of later, and it only introduces a performance problem.

If you do merge, I've left #6467 open so we can wait for people to report back that their particular case is fixed.

@charris (Member) commented Oct 18, 2015

OK, let's give it a shot. Thanks Allan.

Successfully merging this pull request may close these issues.

100 times poorer performance indexing scalar record array (Trac #1386)