BUG: consider shipping numpy 2 final (or rc2) with OpenBLAS 0.3.27 #26240

Closed
ogrisel opened this issue Apr 9, 2024 · 8 comments · Fixed by #26258

ogrisel (Contributor) commented Apr 9, 2024

NumPy 2.0.0rc1 currently ships OpenBLAS 0.3.26, which has a thread-safety bug on Windows.

This was fixed in OpenBLAS 0.3.27, which is not yet packaged on pypi.org.

This bug is causing the scikit-learn release CI to freeze when testing against numpy 2.0.0rc1.

ogrisel (Contributor, author) commented Apr 9, 2024

Note that we also discovered that `python -m threadpoolctl -i numpy` fails to detect the OpenBLAS library linked to numpy, because numpy now ships a dynamic library named "libscipy_openblas64_-xxxxxx.so" instead of the usual "libopenblas64_-xxxxxx.so" (with similarly renamed symbols).

We will update threadpoolctl to accept this naming pattern.
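To illustrate why the rename breaks detection: threadpoolctl recognizes a BLAS implementation by matching the filenames of loaded shared libraries against known prefixes. The sketch below is a simplified, hypothetical version of that matching logic (the prefix list and function name are illustrative, not threadpoolctl's actual code); it shows how adding the `libscipy_openblas` prefix makes the new vendored name detectable again.

```python
# Hypothetical sketch of threadpoolctl-style library detection by filename
# prefix. The exact prefix list used by threadpoolctl may differ; the point
# is that "libscipy_openblas64_-xxxxxx.so" does not match the historical
# "libopenblas" prefix, so the new name must be added explicitly.
OPENBLAS_PREFIXES = ("libopenblas", "libscipy_openblas")

def looks_like_openblas(filename):
    """Return True if a shared-library filename matches a known OpenBLAS prefix."""
    # Strip any leading directory components before matching.
    base = filename.rsplit("/", 1)[-1]
    return base.startswith(OPENBLAS_PREFIXES)

# The historical naming pattern matches...
assert looks_like_openblas("/usr/lib/libopenblas64_-r0.3.26.so")
# ...and so does NumPy's renamed vendored copy, once the prefix is known.
assert looks_like_openblas("libscipy_openblas64_-xxxxxx.so")
# Unrelated libraries are rejected.
assert not looks_like_openblas("libmkl_rt.so")
```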

seberg added this to the 2.0.0 release milestone Apr 9, 2024
rgommers (Member) commented Apr 9, 2024

Good idea, I think we should update to 0.3.27 for rc2 indeed.

> We will update threadpoolctl to accept this naming pattern.

That's fine as a band-aid, but let's do something a bit more future-proof after that. We should be able to rename the shared library again without breaking threadpoolctl. So let's make this simpler and do something like adding a private but tested API that threadpoolctl can query. Something like numpy._has_vendored_openblas(), or something more general.

ogrisel (Contributor, author) commented Apr 9, 2024

Here is the PR for threadpoolctl:

joblib/threadpoolctl#175

Note: threadpoolctl now has a pluggable controller system, so numpy could ship its own library backend:

https://github.com/joblib/threadpoolctl?tab=readme-ov-file#writing-a-custom-library-controller
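The sketch below shows roughly the shape of such a custom controller, following the pluggable-controller interface described in the threadpoolctl README. In real code the class would subclass `threadpoolctl.LibController` and be registered with `threadpoolctl.register()`; it is written standalone here (with the internal API name and symbol names as assumptions) so the shape is visible without the dependency.

```python
# Hypothetical sketch of a custom threadpoolctl library controller for
# NumPy's renamed vendored OpenBLAS. In actual use this would subclass
# threadpoolctl.LibController; the attribute names mirror the interface
# documented in the threadpoolctl README.
class VendoredOpenBLASController:
    # Attributes threadpoolctl uses to classify the library.
    user_api = "blas"
    internal_api = "scipy_openblas"
    # Filename prefixes matched against loaded shared libraries; this
    # prefix covers names like "libscipy_openblas64_-xxxxxx.so".
    filename_prefixes = ("libscipy_openblas",)

    def __init__(self, dynlib=None):
        # In threadpoolctl, ``dynlib`` is the ctypes handle of the loaded
        # shared library; the methods below would call into it.
        self.dynlib = dynlib

    def get_num_threads(self):
        # Would call the library's get-num-threads symbol via self.dynlib.
        raise NotImplementedError

    def set_num_threads(self, num_threads):
        # Would call the library's set-num-threads symbol via self.dynlib.
        raise NotImplementedError

    def get_version(self):
        # Would parse the version string exposed by the library.
        raise NotImplementedError
```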

ogrisel (Contributor, author) commented Apr 9, 2024

BTW, is there an issue somewhere that gives the motivation for the filename / symbol renaming? I found #25505, which upgrades the numpy build system to download the library from pypi.org, but as far as I understand, both numpy and scipy still ship their own copies of OpenBLAS: scipy-openblas32/scipy-openblas64 is not meant to be a shared runtime dependency but rather a pre-built artifact vendored at build time by each project.

EDIT: I guess this is a partial step towards the plan described in the description of #24857.

mattip (Member) commented Apr 9, 2024

The plan is to get SciPy to a point where it can use the same wheel at build time as NumPy. The decision to prefix the symbols with scipy_ was made when we thought we could use the wheel as a runtime dependency, which would mean the symbols are loaded globally. Using the runtime wheel is waiting for improvements in the sdist/binary compatibility: there is currently no way to specify that scipy_openblas is a runtime dependency of the binary NumPy wheels we build without also specifying it as a dependency of the source distribution.

rgommers (Member) commented

> Note: threadpoolctl now has a pluggable controller system, so numpy could ship its own library backend

NumPy itself (excluding BLAS calls) is always single-threaded. So this is probably not useful? We can't deal with the threading behavior of any underlying BLAS library, that's already in the threadpoolctl BLAS support and pretty complex. So my impression is still that all we need here is a more reliable way to tell threadpoolctl if we're using a vendored libopenblas.

> The decision to prefix the symbols with scipy_ was made when we thought we could use the wheel as a runtime dependency,

To add to that: when trying the initial non-prefixed library, we very quickly ran into symbol name conflicts. For example, using numpy from conda-forge (which will rely on libblas with default symbol names) and then building scipy against scipy-openblas32 will yield conflicts between the two BLAS libraries.

See scipy/scipy#19381 for more detail.

> is not meant to be a shared runtime dependency but rather a pre-built artifact vendored at build time by each project.

As Matti said, it actually is meant to become a shared runtime dependency eventually; packaging standards and assumptions are getting in the way, though. We are actively discussing that on the packaging Discourse right now. Here is the most relevant explanation of the issue: https://discuss.python.org/t/enforcing-consistent-metadata-for-packages/50008/32 (and I need to reply to that thread again ASAP).

ogrisel (Contributor, author) commented Apr 10, 2024

> NumPy itself (excluding BLAS calls) is always single-threaded. So this is probably not useful? We can't deal with the threading behavior of any underlying BLAS library, that's already in the threadpoolctl BLAS support and pretty complex. So my impression is still that all we need here is a more reliable way to tell threadpoolctl if we're using a vendored libopenblas.

I meant that if numpy wants to take control of how threadpoolctl accesses the OpenBLAS vendored by numpy, it is possible for numpy to register a dedicated OpenBLAS controller with threadpoolctl (e.g. at numpy import time). I do not necessarily recommend it, but it is technically possible. However, I think I would rather evolve the default OpenBLAS controller provided in threadpoolctl by following the flexible naming approach you suggested in joblib/threadpoolctl#175 (comment).

rgommers (Member) commented

Thanks for explaining, makes sense now. And I agree with your assessment.
