Skip to content

[WIP] CI Build wheels for arm64 on circleci. #20711

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 44 commits into from

Conversation

cmarmo
Copy link
Contributor

@cmarmo cmarmo commented Aug 8, 2021

Reference Issues/PRs

Follow up of #20460

What does this implement/fix? Explain your changes.

This pull request generates wheels for python 3.7, 3.8, 3.9 on arm64 architectures when the commit message contains [cd build].

@cmarmo
Copy link
Contributor Author

cmarmo commented Aug 8, 2021

Should I remove the travis build in this same pull request?
Also, I suppose the artifacts should be uploaded to Pypi at some time: I've checked #19060, but I will not have bandwidth for understanding how it works in the next weeks. It would be nice if the upload could be addressed in another pull request. Thank you very much for your understanding.

@ogrisel
Copy link
Member

ogrisel commented Aug 9, 2021

Should I remove the travis build in this same pull request?

I believe so. They currently error anyway.

The manually triggered github actions config that uploads the wheels stored in the anaconda.org staging area to PyPI should not be impacted by this PR.

However we will need to adapt the code that uploads the generated wheels to the anaconda.org staging area. In particular, I will have to configure the secrets environment variables with the anaconda.org tokens to have them available on our circle ci account.

@ogrisel
Copy link
Member

ogrisel commented Aug 9, 2021

For information I have configured the SCIKIT_LEARN_STAGING_UPLOAD_TOKEN and SCIKIT_LEARN_NIGHTLY_UPLOAD_TOKEN environment variables on Circle CI.

They should automatically be available for this build config but only on the merged pull request (not in the builds that happen when pushing commits to a PR for security reasons).

@cmarmo cmarmo changed the title CI Build wheels for arm64 on circleci. [WIP] CI Build wheels for arm64 on circleci. Aug 10, 2021
@ogrisel
Copy link
Member

ogrisel commented Sep 6, 2021

pfff, I hate this. It seems to work though, I might re-enable the full build.

However I have the feeling that using circle ci to sequentially build and test all the linux/arm64 wheels is going to be too slow on a single free circle ci executor. Maybe we should manage to find a way to fix the travis linux/arm64 wheel build instead.

@ogrisel
Copy link
Member

ogrisel commented Sep 6, 2021

See concurrent experiments in #20958.

@alfaro96
Copy link
Member

alfaro96 commented Sep 6, 2021

pfff, I hate this. It seems to work though, I might re-enable the full build.

However I have the feeling that using circle ci to sequentially build and test all the linux/arm64 wheels is going to be too slow on a single free circle ci executor. Maybe we should manage to find a way to fix the travis linux/arm64 wheel build instead.

I have been a little bit out these months, what is happening with the travis wheel builder?

Maybe we can figure out a solution.

@cmarmo
Copy link
Contributor Author

cmarmo commented Sep 7, 2021

Hi everybody, it seems that this PR is suddenly becoming urgent?
I was indeed working on how to select nightly builds OR [cd build]. But apparently this is not necessary?
My opinion is that this PR could be merged without the uploading artifact step (then without the check in check_builds).
Small steps allow sometimes to move forward with less pain.
This implies that wheels for linux arm64 should be uploaded manually ... this is not the only thing manually done for the release....

Anyway, let me know if I can still do something or you prefer take over. Thanks.

@ogrisel
Copy link
Member

ogrisel commented Sep 7, 2021

I was indeed working on how to select nightly builds OR [cd build]. But apparently this is not necessary?

This is not needed for the 1.0 release we are preparing. But would be useful later.

My opinion is that this PR could be merged without the uploading artifact step (then without the check in check_builds).

The upload step should work now and is needed to get the release automation working. Apparently there is just a missing PATH in my last commit:

WARNING: The script cibuildwheel is installed in '/home/circleci/mambaforge/bin' which is not on PATH.

I will push a fix.

My main concern however is that the free account circleci has only one executor that is already under heavy load to build the doc when when we merge PRs (especially at release time). So I was trying concurrently to fix the linux/arm64 wheel building config on travis to avoid loading circleci to avoid making the CI even slower to iterate with, especially during a release process.

@ogrisel
Copy link
Member

ogrisel commented Sep 7, 2021

I have been a little bit out these months, what is happening with the travis wheel builder?

It seems to be caused by output buffering issues by travis itself. I am trying to find a workaround in #20958.

@ogrisel
Copy link
Member

ogrisel commented Sep 7, 2021

This Circle CI config probably needs tuning of the number of OPENBLAS/OMP_NUM_THREADS and build test pytest parallelism. However since we only have 1 executor, it will never be as fast as the travis build that I have just fixed in #20958: it only takes less than 9min because of the job level parallelism and innerly tuned parallelism.

So I would be in favor of merging #20958 and closing this (#20711).

We can later clean-up the travis config in follow-up PRs but at least the release process should be working and fast now.

ogrisel added a commit to scikit-learn-inria-fondation/follow-up that referenced this pull request Sep 7, 2021
## August 31th, 2021

### Gael

* TODO: Jeremy's renewal, Chiara's replacement, Mathis's consulting gig

### Olivier

- input feature names: main PR [#18010](scikit-learn/scikit-learn#18010) that links into sub PRs
  - remaining (need review): [#20853](scikit-learn/scikit-learn#20853) (found a bug in `OvOClassifier.n_features_in_`)
- reviewing `get_feature_names_out`: [#18444](scikit-learn/scikit-learn#18444)
- next: give feedback to Chiara on ARM wheel building [#20711](scikit-learn/scikit-learn#20711) (needed for the release)
- next: assist Adrin for the release process
- next: investigate regression in loky that blocks the cloudpickle release [#432](cloudpipe/cloudpickle#432)
- next: come back to intel to write a technical roadmap for a possible collaboration

### Julien

 - Was on holidays
 - Planned week @ Nexedi, Lille, from September 13th to 17th
 - Reviewed PRs
     - [`#20567`](scikit-learn/scikit-learn#20567) Common Private Loss module
     - [`#18310`](scikit-learn/scikit-learn#18310) ENH Add option to centered ICE plots (cICE)
     - Others PRs prior to holidays
 - [`#20254`](scikit-learn/scikit-learn#20254)
     - Adapted benchmarks on `pdist_aggregation` to test #20254 against sklearnex
     - Adapting PR for `fast_euclidean` and `fast_sqeuclidean` on user-facing APIs
     - Next: comparing against scipy's 
     - Next: Having feedback on [#20254](scikit-learn/scikit-learn#20254) would also help
- Next: I need to block time to study Cython code.

### Mathis
- `sklearn_benchmarks`
  - Adapting benchmark script to run on Margaret
  - Fix issue with profiling files too big to be deployed on Github Pages
  - Ensure deterministic benchmark results
  - Working on declarative pipeline specification
  - Next: run long HPO benchmarks on Margaret

### Arturo

- Finished MOOC!
- Finished filling [Loïc's notes](https://notes.inria.fr/rgSzYtubR6uSOQIfY9Fpvw#) to find questions with score under 60% (Issue [#432](INRIA/scikit-learn-mooc#432))
    - started addressing easy-to-fix questions, resulting in gitlab MRs [#21](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/21) and [#22](https://gitlab.inria.fr/learninglab/mooc-scikit-learn/mooc-scikit-learn-coordination/-/merge_requests/22)
    - currently working on expanding the notes up to 70%
- Continued cross-linking forum posts with issues in GitHub, resulting in [#444](INRIA/scikit-learn-mooc#444), [#445](INRIA/scikit-learn-mooc#445), [#446](INRIA/scikit-learn-mooc#446), [#447](INRIA/scikit-learn-mooc#447) and [#448](INRIA/scikit-learn-mooc#448)

### Jérémie
- back from holidays, catching up
- Mathis' benchmarks
- trying to find what's going on with ASV benchmarks
  (asv should display the versions of all build and runtime depndencies for each run)

### Guillaume

- back from holidays
- Next:
    - release with Adrin
    - check the PR and issue trackers

### TODO / Next

- Expand Loïc’s notes up to 70% (Arturo)
- Create presentation to discuss my experience doing the MOOC (Arturo)
- Help with the scikit-learn release (Olivier, Guillaume)
- HR: Jeremy's renewal, Chiara's replacement (Gael)
- Mathis's consulting gig (Olivier, Gael, Mathis)
@ogrisel
Copy link
Member

ogrisel commented Sep 7, 2021

@glemaitre suggested that we could keep using circleci but only once a day for the scheduled nightly job (to upload the wheels for the main branch to https://anaconda.org/scipy-wheels-nightly/scikit-learn/files every night). Build time is not a problem since it's not an interactive process.

And we could keep the travis config only for the release process with the manual [cd build] triggers to the main or release branches.

This way we could spare the limited free travis credits to dedicate them to release efforts.

If we agree on this plan, we would need to:

@cmarmo
Copy link
Contributor Author

cmarmo commented Sep 7, 2021

Thanks @ogrisel for your clarifications. I'm sorry to say that I will not have enough bandwidth to follow-up on this PR then.
Feel free to close it and choose the path you prefer for the next step.

@ogrisel ogrisel removed this from the 1.0 milestone Sep 7, 2021
@ogrisel
Copy link
Member

ogrisel commented Sep 16, 2021

I light of the recent security debacle at travis and their poor way of handling the matter we have no choice but to move forward with completely migrating all the travis ARM builds to circle ci. And we will probably have to pay for the service to increase the concurrency a bit to make it fast enough when releasing.

@alfaro96
Copy link
Member

@ogrisel
Copy link
Member

ogrisel commented Sep 16, 2021

I think the first step for this plan would be to migrate the doc building jobs out of circle ci, for instance, to a new workflow on github actions.

This workflow could run:

  • in cheap "doc diff mode" in PRs (as done today on circle ci) to only run the examples impacted by the PR
  • in full doc build mode using the [doc build] commit flag in PRs as done today
  • in full doc build mode on main with upload to dev using a nightly schedule trigger (instead of at each merge commit on main)
  • in full doc build mode on main with upload to /dev or on release branches with upload to /X.Y with a manual trigger in the github actions UI.

This way we should free the overworked circle ci executor to be able to focus on ARM builds. And if we decide to buy some circle ci build credits, they will be more wisely allocated to the scarce ARM building resources rather than being wasted on doc building which can be done more efficiently on github actions.

Edit: BTW I did rotate our secret anaconda.org upload tokens for this time, so there is no emergency but it would be safer for the future.

@cmarmo
Copy link
Contributor Author

cmarmo commented Jun 18, 2022

Just linking here the CircleCI documentation about the improved Arm execution environment.... if this can be useful.

@jjerphan
Copy link
Member

Thank you for the pointer, @cmarmo. Do you want and/or have time to pursue this PR?

@cmarmo
Copy link
Contributor Author

cmarmo commented Jun 25, 2022

Thank you for the pointer, @cmarmo. Do you want and/or have time to pursue this PR?

@jjerphan , to be honest it is not clear to me if this feature is wanted or not... in general, I seem to have a different approach to CI pull requests with respect to what is expected by historical contributors.
I guess this PR is starting to be obsolete anyway.
I was just pointing out that, if performances were the issue, perhaps the new CircleCI Arm infrastructure could be helpful.

@cmarmo
Copy link
Contributor Author

cmarmo commented Dec 12, 2022

I am closing this one as obsolete.
Also, it looks like Cirrus is the new black (#25048).

@cmarmo cmarmo closed this Dec 12, 2022
@cmarmo cmarmo deleted the circle-cd-build-arm branch December 12, 2022 04:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants