Skip to content

MAINT: Improve speed of ufunc kwargs parsing. #11333

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mhvk
Copy link
Contributor

@mhvk mhvk commented Jun 14, 2018

This uses the realization from inspecting similar CPython code that as we intern string names anyway, comparing keys with possible names by pointer will generally just work, and is of course much faster than first converting a unicode object to char* and then using strcmp.

@eric-wieser - my C remains limited: is there a cleverer way than the enum I used, ideally one that allows on to type things just once (there are now five places one has to keep up to date, three for the interns, one enum, and one array of the interned strings; this seems too much!)

Reduces the overhead as follows:

import numpy as np
a = np.array(1)
b = np.empty_like(a)
%timeit np.positive(a)                                 # 352->348 ns
%timeit np.positive(a, subok=True)                     # 606->501 ns
%timeit np.positive(a, where=True)                     # 586->503 ns
%timeit np.positive(a, where=True, subok=True)         # 695->531 ns
%timeit np.positive(a, b)                              # 354->352 ns
%timeit np.positive(a, out=b)                          # 557->480 ns
%timeit np.positive(a, out=b, subok=True)              # 668->506 ns
%timeit np.positive(a, out=b, where=True, subok=True)  # 752->536 ns

Using the realization from inspecting similar CPython code that as we
intern string names anyway, comparing keys with possible names by
pointer will generally just work.

Reduces the overhead as follows:
```
import numpy as np
a = np.array(1)
b = np.empty_like(a)
%timeit np.positive(a)                                 # 352->348 ns
%timeit np.positive(a, subok=True)                     # 606->501 ns
%timeit np.positive(a, where=True)                     # 586->503 ns
%timeit np.positive(a, where=True, subok=True)         # 695->531 ns
%timeit np.positive(a, b)                              # 354->352 ns
%timeit np.positive(a, out=b)                          # 557->480 ns
%timeit np.positive(a, out=b, subok=True)              # 668->506 ns
%timeit np.positive(a, out=b, where=True, subok=True)  # 752->536 ns
```
@mhvk
Copy link
Contributor Author

mhvk commented Jun 14, 2018

p.s. Quite obviously, more to be gained by doing this elsewhere, and by looking for out just once instead of who-knows-how-many-times.

for (kw_id = 0; kw_id < ufunc_badkey; kw_id++) {
int cmp = PyObject_RichCompareBool(key, ufunc_kwnames[kw_id], Py_EQ);
if (cmp > 0) {
goto kw_found;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a goto, just make the above a helper function returning kw_id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed CPython's ceval quite exactly! But with a helper function, this may become useful for GenericReduction as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you link to the code you're imitating?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s. Do think the helper function is a good idea, but was waiting with making changes as I hoped someone would have a better idea of how to handle those name arrays and the enum -- I like the names but not that multiple lists have to be maintained separately. Can one iterate over an enum?

@eric-wieser
Copy link
Member

I'm curious how much the switch statement buys us over a bunch of if/else ifs comparing PyObject * pointers to interned strings, which would eliminate the need for an enum.

@mhvk
Copy link
Contributor Author

mhvk commented Jun 15, 2018

The one issue would be that then the slow path becomes a whole set of goto statements... (and given that python has it, I do not quite dare to remove it...)

@mhvk
Copy link
Contributor Author

mhvk commented Jun 15, 2018

I think the real solution here would be to mimic PyArg_ParseTupleAndKeywords, with all those bits of code interpreting the various keywords properly passed on as functions. But that may be better done as another PR...

@mhvk
Copy link
Contributor Author

mhvk commented Jun 21, 2018

Closing in favour of #11351

@mhvk mhvk closed this Jun 21, 2018
@mhvk mhvk deleted the ufunc-use-interned-strings-in-parsing branch June 21, 2018 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants