PERF: improve multithreaded ufunc scaling #27913
Merged
Backport of #27896.
Fixes #27786 and is a re-do of #27859.
Thanks to @seiko2plus for the suggestion to convert this code to C++ and use C++17 features. Let's see if any of the platforms we run CI on object to using this; it passes the tests on my MacBook Pro.
There is an earlier commit included in this PR that keeps things in C and uses the CPython-internal _PyRWMutex, which also seems to work, but using C++ lets us avoid the need to write a PEP or make an argument to the C API workgroup that making _PyRWMutex public would be nice for C extensions.
With the test script in the issue, I see the following scaling using std::shared_mutex:
PERF: add a fast path to ufunc type resolution
MAINT: move dispatching.c to C++
MAINT: move npy_hashtable to C++ and use std::shared_mutex
MAINT: fix windows linking
MAINT: remove outdated comment
MAINT: only call try_promote on free-threaded build
Converts dispatching to cpp in order to use std::shared_mutex to improve free-threaded scaling.
MAINT: try to give new function a name indicating it uses a mutex
MAINT: only do complicated casting to get a mutex pointer once
MAINT: use std::nothrow to avoid dealing with exceptions
DOC: add changelog
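The pattern these commits describe (a shared read lock on the lookup fast path, an exclusive lock only when inserting, and std::nothrow allocation so failures surface as a null pointer instead of an exception) can be sketched roughly as follows. This is a minimal illustration and not the actual NumPy code: SharedCache, lookup, get_or_insert, and run_demo are hypothetical names, and the int-to-int table merely stands in for the real dispatch cache.

```cpp
#include <cassert>
#include <mutex>
#include <new>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical read-mostly cache guarded by a reader-writer lock.
class SharedCache {
public:
    // std::nothrow makes operator new return nullptr on allocation failure
    // instead of throwing std::bad_alloc. It does not suppress exceptions
    // thrown by later operations such as table insertion.
    static SharedCache *create() {
        return new (std::nothrow) SharedCache();
    }

    // Fast path: any number of threads may hold the shared lock at once.
    bool lookup(int key, int *out) const {
        std::shared_lock<std::shared_mutex> guard(mutex_);
        auto it = table_.find(key);
        if (it == table_.end()) {
            return false;
        }
        *out = it->second;
        return true;
    }

    // Slow path: take the exclusive lock only after the shared lookup misses.
    // emplace leaves an existing entry untouched, so if another thread
    // inserted the key in the meantime, its value is returned.
    int get_or_insert(int key, int value) {
        int found = 0;
        if (lookup(key, &found)) {
            return found;
        }
        std::unique_lock<std::shared_mutex> guard(mutex_);
        auto result = table_.emplace(key, value);
        return result.first->second;
    }

private:
    SharedCache() = default;
    mutable std::shared_mutex mutex_;
    std::unordered_map<int, int> table_;
};

// Small single-threaded walk-through of the API; returns the cached value,
// or -1 if the cache could not be allocated.
int run_demo() {
    SharedCache *cache = SharedCache::create();
    if (cache == nullptr) {
        return -1;
    }
    int first = cache->get_or_insert(1, 42);   // miss: inserts 42
    int second = cache->get_or_insert(1, 99);  // hit: existing 42 is kept
    assert(first == second);
    delete cache;
    return second;
}
```

In a free-threaded build, the benefit comes from lookup: concurrent readers do not serialize on the lock, so only the comparatively rare insertions contend.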