PERF: improve multithreaded ufunc scaling #27913


Merged 1 commit on Dec 5, 2024


@charris (Member) commented Dec 5, 2024

Backport of #27896.

Fixes #27786 and re-do of #27859.

Thanks to @seiko2plus for the suggestion to convert this code to C++ and use C++17 features. Let's see whether any of the platforms we run CI on object to this; it passes the tests on my MacBook Pro.

There is an earlier commit included in this PR that keeps things in C and uses the CPython-internal _PyRWMutex, which also seems to work, but using C++ lets us avoid the need to write a PEP or make an argument to the C API workgroup that making _PyRWMutex public would be nice for C extensions.

With the test script in the issue, I see the following scaling using std::shared_mutex:

[Figure: mflops_array_length_1000]
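For readers unfamiliar with the pattern, here is a minimal sketch of a read-mostly cache guarded by `std::shared_mutex`. It is not the code from this PR: `SharedCache`, `find`, and `find_or_insert` are hypothetical names, and a `std::unordered_map` stands in for the real hash table. The idea it illustrates is that lookups take a shared lock and can run concurrently, while the rare insert takes an exclusive lock.

```cpp
// Sketch only: SharedCache and its members are hypothetical names,
// not NumPy's dispatch cache.
#include <cassert>
#include <cstdint>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <unordered_map>

class SharedCache {
public:
    // Fast path: any number of threads may hold the shared lock at once.
    std::optional<int> find(std::uint64_t key) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = table_.find(key);
        if (it == table_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Slow path: the exclusive lock excludes all readers and writers.
    // Returns the value that ends up cached for `key`.
    int find_or_insert(std::uint64_t key, int value) {
        if (auto hit = find(key)) {
            return *hit;
        }
        std::unique_lock<std::shared_mutex> lock(mutex_);
        // Another thread may have inserted between the two locks;
        // emplace leaves an existing entry untouched in that case.
        auto result = table_.emplace(key, value);
        return result.first->second;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<std::uint64_t, int> table_;
};
```

Once the cache is warm, nearly every call takes only the shared lock, which is what lets readers on different threads proceed in parallel.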

  • PERF: add a fast path to ufunc type resolution

  • MAINT: move dispatching.c to C++

  • MAINT: move npy_hashtable to C++ and use std::shared_mutex

  • MAINT: fix windows linking

  • MAINT: remove outdated comment

  • MAINT: only call try_promote on free-threaded build

Converts dispatching to cpp in order to use std::shared_mutex to improve free-threaded scaling.

  • MAINT: try to give new function a name indicating it uses a mutex

  • MAINT: only do complicated casting to get a mutex pointer once

  • MAINT: use std::nothrow to avoid dealing with exceptions

  • DOC: add changelog
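Two of the commits above mention `std::nothrow` and casting to a mutex pointer only once. The sketch below shows one way those two ideas can fit together, under loudly stated assumptions: `Table`, `table_new`, `table_free`, `table_size_with_mutex`, and `table_increment_with_mutex` are hypothetical names invented for illustration, and the struct layout is not NumPy's.

```cpp
// Sketch only: every name here is hypothetical, not NumPy's API.
#include <cassert>
#include <mutex>
#include <new>
#include <shared_mutex>

// C-compatible struct: the mutex is stored as an opaque pointer so that
// C translation units never need to see the C++ type.
struct Table {
    void *mutex;  // really a std::shared_mutex *
    int nelem;
};

// Returns nullptr when allocation fails instead of throwing std::bad_alloc.
Table *table_new() {
    Table *tb = new (std::nothrow) Table{nullptr, 0};
    if (tb == nullptr) {
        return nullptr;
    }
    tb->mutex = new (std::nothrow) std::shared_mutex();
    if (tb->mutex == nullptr) {
        delete tb;
        return nullptr;
    }
    return tb;
}

void table_free(Table *tb) {
    if (tb == nullptr) {
        return;
    }
    delete static_cast<std::shared_mutex *>(tb->mutex);
    delete tb;
}

int table_size_with_mutex(Table *tb) {
    // Cast once, then reuse the typed reference.
    auto &mutex = *static_cast<std::shared_mutex *>(tb->mutex);
    std::shared_lock<std::shared_mutex> lock(mutex);
    return tb->nelem;
}

void table_increment_with_mutex(Table *tb) {
    auto &mutex = *static_cast<std::shared_mutex *>(tb->mutex);
    std::unique_lock<std::shared_mutex> lock(mutex);
    ++tb->nelem;
}
```

Using `new (std::nothrow)` lets callers check for a null pointer in the usual C style, so no C++ exception from a failed allocation has to cross into C code.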

@charris added the labels 03 - Maintenance, 08 - Backport (used to tag backport PRs), and 39 - free-threading (PRs and issues related to support for free-threading CPython, a.k.a. no-GIL, PEP 703) Dec 5, 2024
@charris added this to the 2.2.0 release milestone Dec 5, 2024
@charris merged commit 7895ba6 into numpy:maintenance/2.2.x Dec 5, 2024
68 checks passed
@charris deleted the backport-27896 branch December 5, 2024 15:00