Skip to content

MNT Use cibuildwheel to build the wheels #17921

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 163 commits into from
Nov 6, 2020

Conversation

alfaro96
Copy link
Member

Reference Issues/PRs

Closes MacPython/scikit-learn-wheels#27.
Closes MacPython/scikit-learn-wheels#61.

What does this implement/fix? Explain your changes.

This PR moves the scikit-learn wheel builder to the scikit-learn organization using the cibuildwheel package.

@alfaro96 alfaro96 marked this pull request as draft July 14, 2020 09:16
Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @alfaro96 ! Are we going to also upload them somewhere from this same script?

@alfaro96
Copy link
Member Author

Thanks @alfaro96 ! Are we going to also upload them somewhere from this same script?

Thank you @adrinjalali for the quickness!

The idea is to add a step to upload the wheels and source distribution to the Anaconda repositories (scipy-wheels-nightly and scikit-learn-wheels-staging).

WDYT?

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this @alfaro96 !

Copy link
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mean to approve this PR, and can't find the 'unapprove' button now :)

Edit: there used to be a "dismiss review" button before, but I can't find it anymore..

@alfaro96 alfaro96 requested review from rth and adrinjalali July 14, 2020 14:35
@alfaro96
Copy link
Member Author

alfaro96 commented Jul 14, 2020

Some tests are not passing while testing the wheels. I will investigate this issue and let you know why this is happening.

Nevertheless, it is pretty straightforward to build the wheels with cibuildwheel.

@alfaro96
Copy link
Member Author

alfaro96 commented Jul 14, 2020

It seems we are running into memory issues. For instance:

  • test_common_check_return_X_y[fetch_covtype-fetch_covtype]: MemoryError: Unable to allocate 244. MiB for an array with shape (581012, 55) and data type float64.
  • test_svc_ovr_tie_breaking[SVC]: MemoryError: Unable to allocate 22.9 MiB for an array with shape (1000000, 3) and data type float64.

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 5, 2020

Hey @ogrisel,

I have finally found the right way to go (and, IMHO, the best and cleanest solution).

The cibuildwheel package provides the CIBW_REPAIR_WHEEL_COMMAND_WINDOWS environment variable to repair a Windows wheel (by default, not repaired). However, I have realized that, in our case, it would be interesting to use this command to embed the vcomp140.dll. In particular, I have used the unpack + vendor_vcomp140.py + pack sequence to repair the Windows wheel.

It is important to note that the command is run on each built wheel before testing it (between the build and install + test steps).

Let us wait for green in the CD, but I think that this solution is gonna work pretty well.

Thank you @ogrisel for your valuable comments!

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 5, 2020

Moreover, I have added the wheels for Python 3.9.

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 5, 2020

The generated wheels vendor vcomp140.dll:

unzip -l scikit_learn-0.24.dev0-cp36-cp36m-win32.whl  | grep vcomp
   150400  09-16-2020 12:16   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp37-cp37m-win32.whl  | grep vcomp
   150400  09-16-2020 12:16   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp38-cp38-win32.whl  | grep vcomp
   150400  09-16-2020 12:16   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp39-cp39-win32.whl  | grep vcomp
   150400  09-16-2020 12:16   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp36-cp36m-win_amd64.whl  | grep vcomp
   177032  09-16-2020 12:15   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp37-cp37m-win_amd64.whl  | grep vcomp
   177032  09-16-2020 12:15   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp37-cp37m-win_amd64.whl  | grep vcomp
   177032  09-16-2020 12:15   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp38-cp38-win_amd64.whl  | grep vcomp
   177032  09-16-2020 12:15   sklearn/.libs/vcomp140.dll
unzip -l scikit_learn-0.24.dev0-cp39-cp39-win_amd64.whl  | grep vcomp
   177032  09-16-2020 12:15   sklearn/.libs/vcomp140.dll

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing @alfaro96 !

CIBW_TEST_REQUIRES: pytest pandas threadpoolctl
# Test that there are no links to system libraries
CIBW_TEST_COMMAND: pytest --pyargs sklearn &&
python -m threadpoolctl -i sklearn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need code to check the output of python -m threadpoolctl -i sklearn?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could write another helper python script that does this using the Python API of threadpoolctl instead of the command line.

I still think it's good to print it on stdout for manual checks and debugging.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way I manually checked the output of one of the last green build for windows below and it looks good:

 [
  {
    "filepath": "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpnijjd_bt\\lib\\site-packages\\sklearn\\.libs\\vcomp140.dll",
    "prefix": "vcomp",
    "user_api": "openmp",
    "internal_api": "openmp",
    "version": null,
    "num_threads": 2
  },
  {
    "filepath": "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpnijjd_bt\\lib\\site-packages\\numpy\\.libs\\libopenblas.NOIJJG62EMASZI6NYURL6JBKM4EVBGM7.gfortran-win_amd64.dll",
    "prefix": "libopenblas",
    "user_api": "blas",
    "internal_api": "openblas",
    "version": "0.3.9.dev",
    "num_threads": 2,
    "threading_layer": "pthreads"
  },
  {
    "filepath": "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpnijjd_bt\\lib\\site-packages\\scipy\\.libs\\libopenblas.3HBPCJB5BPQGKWVZAVEBXNNJ2Q2G3TUP.gfortran-win_amd64.dll",
    "prefix": "libopenblas",
    "user_api": "blas",
    "internal_api": "openblas",
    "version": "0.3.9",
    "num_threads": 2,
    "threading_layer": "pthreads"
  }
]

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 5, 2020

Is there anything else to do?

@ogrisel
Copy link
Member

ogrisel commented Nov 5, 2020

Is there anything else to do?

Addressing https://github.com/scikit-learn/scikit-learn/pull/17921/files#r518173353 would be a nice to have but I am fine with merging this PR as it is and further improve it other PRs.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! BTW the way, I just defined the two upload tokens as github secrets on the scikit-learn repository to hopefully have the anaconda upload work as expected.

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 5, 2020

Is there anything else to do?

Addressing https://github.com/scikit-learn/scikit-learn/pull/17921/files#r518173353 would be a nice to have but I am fine with merging this PR as it is and further improve it other PRs.

Indeed, I still want to provide the wheels for aarch64 using travis, but I need to refactor the .travis.yml file and I think that it would be better to address in a separate PR.

Copy link
Member Author

@alfaro96 alfaro96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought.

@ogrisel
Copy link
Member

ogrisel commented Nov 6, 2020

@thomasjpfan @rth shall we merge this to test it a bit "for real" and then open new PRs with improvements (e.g. ARM builds, additional sanity checks...)

@rth
Copy link
Member

rth commented Nov 6, 2020

I haven't reviewed, but sound good to me. Great work @alfaro96 !

@ogrisel
Copy link
Member

ogrisel commented Nov 6, 2020

@rth I pinged you because you marked this PR as 1 change requested a while ago but I believe your concerns have been addressed. Let's wait for @thomasjpfan then.

@ogrisel
Copy link
Member

ogrisel commented Nov 6, 2020

Oh I now see that it's an artifact of the lack of cancel review button on github... Sorry for the noise.

@alfaro96
Copy link
Member Author

alfaro96 commented Nov 6, 2020

@thomasjpfan @rth shall we merge this to test it a bit "for real" and then open new PRs with improvements (e.g. ARM builds, additional sanity checks...)

That would be nice 😉.

Should we specify the [cd build] marker when merging or wait for the schedule job?

@ogrisel
Copy link
Member

ogrisel commented Nov 6, 2020

Let me merge it with the [cd build] to give it a full end to end run.

@ogrisel ogrisel merged commit cd60a35 into scikit-learn:master Nov 6, 2020
@alfaro96 alfaro96 deleted the wheel_builder branch November 6, 2020 14:07
@alfaro96
Copy link
Member Author

alfaro96 commented Nov 6, 2020

Okay, I forgot to add a trigger for push on the master branch and only build on it if the [cd build] commit trigger is found.

A quick fix is coming.

@ogrisel
Copy link
Member

ogrisel commented Nov 6, 2020

yes I just noticed... cd60a35.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate cibuildwheel Move this to the scikit-learn org?
7 participants