-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
BLD: replace uses of openblas_support with openblas wheels [wheel build] #25505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks Matti. This looks about right; the failures on 32-bit Windows, musllinux and macOS arm64 are all the same (per platform) - you may want to fix those on a single wheel build job on your own fork before running the whole battery of builds here. |
Checkout https://github.com/numpy/numpy/wiki/Debugging-CI-guidelines for some suggestions on how to debug cirrus locally (won't help with macos arm 64 if you don't have your own Mac) |
Mess. The scipy-openblas wheels differ from the tarballs. The tarballs we currently use have, in usr/local/lib, symlinks like
But the real file I think we have to rework the scipy-openblas wheel build before we can use it here to
|
That seems like the right change to make. There's no need for names like |
00d1539
to
7a03e9c
Compare
Updated to use the newer 0.3.26.0.2 wheels with a single shared object |
Diff looks reasonable - just some wheel build jobs to make happy |
@andyfaff do you know why the cirrus build sequence for wheels first builds NumPy, then calls cibuildwheel to build it again? It is failing since the cibw_before_build.sh does not overwrite |
macos with python 3.11, 3.12 and pypy3.9 is failing but with python3.9 and 3.10 is passing. The problem seems to be that delocate is not seeing the |
It shouldn't do. Can you link to a build log that shows this. When I look at e.g. the cron job on cirrus I can't see the two builds. A quick reminder, there is a matrix of images that the wheel building occurs for. Once for all the pythons on sonoma, and once on monterey. The wheels on Sonoma should be using Accelerate for BLAS, those on Monterey are supposed to still use OpenBLAS. It's not commented why delocate is pinned in the cibw_before_build script. I'll have a look into these issues in the next few days. It's going to be a slow ping time for me because I'm busy at work this week. |
Here is a build log with 4 steps: clone, build, install_cibuildwheel, cibuildwheel.
Those builds are probable not setting numpy/tools/wheels/cibw_before_build.sh Lines 20 to 22 in d3345bb
which then triggers the use of OpenBLAS |
The naming of those build steps is a little cryptic. The first 'build' step is actually firing all the other steps off, not actually building numpy in it. It could be called something like 'initial setup' or something like that. I'm not sure why the cibw_before_build script is executed in that step, possibly cruft that can be cleaned up. Debugging all of this is probably better done locally, using the wiki instructions I wrote. |
The Accelerate jobs should not have OpenBLAS installed, and that was certainly working until recently. Maybe the requirements files reshuffle broke it? |
Should be fixed by #25763 |
Thanks I will try out these instructions without a VM on a mac machine. Or did you mean the containerized one? |
If you have a macos m series I would try https://github.com/numpy/numpy/wiki/Debugging-CI-guidelines#local-build-of-macos-wheels-using-cibuildwheel-in-a-macos-tart-vm with the entries that youve seen fail. |
d5c4981
to
fcc7ac6
Compare
Rebased on top of #25763 and added a fix for cibuildwheel not setting macos repair wheel environment variables as specified in the |
|
||
source $PROJECT_DIR/tools/wheels/gfortran_utils.sh | ||
install_gfortran | ||
pip install "delocate==0.10.4" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For reviewers: this whole stanza has been removed: gfortran will not be installed on macos. The scipy-openblas wheels do not need it. This means the build is faster , but also that we cannot test 2py in these builds and I think the tests can use the gfortran from the conda install.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but also that we cannot test 2py in these builds.
That's fine; those are way too slow and need their own CI jobs instead. There's already an open issue for that somewhere.
I think this is ready for review. Some follow up, that I will do together with fixes from review:
|
That seems very fast, it is an order of magnitude faster than the cirrus aarch64 builds and twice as fast as the linux x86_64 cp11 build. Maybe something is wrong? |
By default each cirrus Mac runner has several cores, so compilation and testing is fast. That seems reasonable timing. If I do a clean build of numpy (with Accelerate) the build step takes 17s on an M1 mac, the testing takes about 2.5 mins. |
One wheel build timed out in testing. Everything else passed. I don't think the failure is connected to this PR, but maybe? |
Rerunning the timed-out muslinux wheel build, it now passed. CI is green. |
A question before I review. The TLDR of the PR seems to be: install scipy_openblasX, then create a PKG_CONFIG_PATH with a scipy-openblas.pc file that is sourced from the scipy-openblasX wheel. It's probably been discussed before, but it's not clear to me why the build (i.e. meson) can't be scripted in such a way that if it knows it's going to be using the scipy-openblasX wheel to source openblas, then why can't it query the wheel to figure out where the pc file is, and where the shared library is? i.e. why does this PR have to jump through hoops to move the pc file into a PKG_CONFIG_PATH? e.g. on a mac the spin cmd for build appears to work as follows (won't have all the details correct here):
My point is that in step 2 the user doesn't have to fiddle about with setting PKG_CONFIG_PATH, or creating files, everything is taken care of behind the scenes. Why can't this happen in this PR, why the manual intervention by setting paths beforehand? |
- name: Build a wheel via an sdist | ||
env: | ||
PKG_CONFIG_PATH: ${{ github.workspace }}/.openblas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is set twice, probably because environment variables aren't persistent between steps? Would it be better to use GITHUB_ENV?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like you set here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
.github/workflows/linux_blas.yml
Outdated
@@ -72,26 +72,32 @@ jobs: | |||
python-version: '3.11' | |||
|
|||
- name: Install dependencies | |||
env: | |||
# Need to do this in every step since setup-python overrides PKG_CONFIG_PATH | |||
PKG_CONFIG_PATH: ${{ github.workspace }}/.openblas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Write to GITHUB_ENV to prevent having to set it repeatedly?
tools/wheels/cibw_before_build.sh
Outdated
PKG_CONFIG_PATH=$PROJECT_DIR/.openblas | ||
rm -rf $PKG_CONFIG_PATH | ||
mkdir -p $PKG_CONFIG_PATH | ||
python -m pip install scipy-openblas64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for maintenance branches how do we choose what version of scipy-openblas gets installed. e.g. for a certain maintenance branch we may want to stick with a certain (older) version, and bump when we do a major release. At the moment we just get the latest version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use the ci_requirements.txt file.
elif [[ $RUNNER_OS == "Windows" ]]; then | ||
# Install Openblas from scipy-openblas64 | ||
if [[ "$INSTALL_OPENBLAS" = "true" ]] ; then | ||
echo PKG_CONFIG_PATH $PKG_CONFIG_PATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we have to install scipy-openblas here - understood. However, why is the rest needed? Can't the meson build process do the querying of scipy-openblas to figure out all the necessary info?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Meson could run python and try to import scipy_openblas32, then try to import scipy_openblas64. If either of those succeed, it could use get_include()
and get_libary()
to add the proper flags to the build. @rgommers thoughts?
The cibw_before_build.sh script does exactly what Is that what you were thinking? |
Just to be more accurate: that is not the spin logic, it is logic inside Outside meson.build: meson tries very hard to not incorporate too much logic and magic inside it, and depends on the user to properly set up pkg-config or cmake files for dependency detection. Inside meson.build: there is some merit to not repeating the pkg-config file creation twice, and to eliminate the need to externally set a |
OK, so you're saying that the main purpose is to let delocate/auditwheel/delvewheel know where the libraries are? I don't know why they would need to be told that. For example, if I use
It tells me where the openblas library is! They should just go and grab it. Perhaps I'm being too nitpicky. What we have so far is obviously working, so perhaps we should merge instead of making things too magical. |
The tarball used by Edit: with that, I am prepared to rethink the work here and discuss alternatives. Edit2: I think you are looking at the results of a local build of scipy. When installing scipy into a virtual environment from a wheel it looks different:
|
Ah ok. Well then since it's green we may as well merge, modulo the things that I suggested that you think should be added. |
I pushed a fix commit:
|
Using |
Ci is green after the changes. |
Looks good to me. A good block of work, thanks @mattip |
before-build = "bash {project}/tools/wheels/cibw_before_build.sh {project}" | ||
# The build will use openblas64 everywhere, except on arm64 macOS >=14.0 (uses Accelerate) | ||
config-settings = "setup-args=-Duse-ilp64=true setup-args=-Dallow-noblas=false build-dir=build" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattip why did you remove use-ilp64
here? It looks like it switched all the wheels to LP64. Log extracts:
# manylinux x86-64:
Run-time dependency scipy-openblas found: YES 0.3.26
Message: BLAS symbol suffix:
# Accelerate:
Run-time dependency accelerate found: YES
Message: BLAS symbol suffix: $NEWLAPACK
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the macros to the pkg-config file here so it is not needed for the scipy-openblas build. I guess it is important to leave it in for diagnostics and for other blas builds though, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes indeed
No more tarballs or |
Use the openblas wheels instead of
tools/openblas_support.py
inThis will fix #25503 (unpin cibuildwheel)