CI: move Linux and MacOS Azure builds to conda lock files #22448
Conversation
Note that the lock file in the development version of ...
OK, I think I saw some discussions about this but did not investigate further. Do you think we should wait for a release? As long as the diff between two lock files (with the same conda-lock command but after one dependency has a new release) is reasonably readable, that seems fine to me (full disclosure: I have not checked; maybe the order can change quite a bit).
One of the nice things of the conda-lock development version is the support for a pip section in environment.yml (via poetry); we may need this for some builds where we combine conda + pip, e.g. the one using the conda defaults channel (not 100% sure).
There are a few nice features in the dev version we could use. I would wait for their release.
Can we install the dev version (e.g. a fixed commit hash) with a pip git URL in the meantime?
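Something along these lines should work, assuming we pin a specific revision (a sketch only; <commit-hash> is a placeholder, and the repository URL should be double-checked):

# Install conda-lock from a pinned commit so the CI setup stays reproducible
# while relying on unreleased features (replace <commit-hash> with a real revision).
pip install "conda-lock @ git+https://github.com/conda-incubator/conda-lock@<commit-hash>"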
I am seeing weird errors with the dev version. I think I can keep moving more builds to lock files in the meantime.
As an alternative I could do a first PR using the released version of conda-lock.
Also regenerate lock files
About the diff between lock files, you can see in d2a6dea#diff-708171dcd71ffc9f7ed79f676e2cb3dd1b923ba3e21319eb5e91613f12737120 that the diff is reasonably readable even if the order changes a bit.
Since the bug has been fixed in the dev branch of conda-lock we can consider using the dev version again (installed from a fixed git hash for stability), in order to use the new features if needed. |
With the update of the Python version, the build_tools/azure/update_environments_and_lock_files.py script now runs on my M1 machine.
There are likely merge conflicts from: #23174
In my last commit, CI is green for all the nightly builds (including the pip-nogil build). Right now, the python-nogil lock file is generated manually rather than with the update script.
+1 for keeping it manual for the nogil build for now.
I'm okay with the handling of nogil. I tested it on M1 and it works as expected.
DOCKER_CONTAINER: 'i386/debian:11.2'
JOBLIB_VERSION: 'min'
DISTRIB: 'debian-32'
LOCK_FILE: './build_tools/azure/debian_atlas_32bit_lock.txt'
# disable pytest xdist due to unknown bug with 32-bit container
PYTEST_XDIST_VERSION: 'none'
Can we remove PYTEST_XDIST_VERSION as well? I do not think it is being used anymore.
PYTEST_XDIST_VERSION is still used in the test script (scikit-learn/build_tools/azure/test_script.sh, lines 65 to 66 in 7bdd426):

if [[ "$PYTEST_XDIST_VERSION" != "none" ]]; then
    TEST_CMD="$TEST_CMD -n$CPU_COUNT"
Longer term, I was thinking about cleaning up environment variables like PYTEST_XDIST_VERSION and COVERAGE, and checking whether a package is installed instead of relying on an environment variable.
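A rough sketch of what that could look like in the test script (a hypothetical replacement for the PYTEST_XDIST_VERSION check, not code from this PR):

# Enable pytest-xdist only if it is actually installed in the test environment,
# instead of reading the PYTEST_XDIST_VERSION environment variable.
if python -c "import xdist" 2> /dev/null; then
    TEST_CMD="$TEST_CMD -n$CPU_COUNT"
fi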
# The content below has been generated manually with the following command:
# docker run -v $PWD:/scikit-learn -it nogil/python bash -c 'pip install pip-tools; pip-compile /scikit-learn/build_tools/azure/python_nogil_requirements.txt -o /scikit-learn/build_tools/azure/python_nogil_lock.txt'
#
# The reason behind it is that you need python-nogil to generate the pip lock
# file. Using pip-compile --index and --extra-index will not work, for example
# the latest cython will be picked up from PyPI, rather than the one from the
# python-nogil index
When I run the docker run -v $PWD:/scikit-learn ... command, this header gets replaced. Maybe we move this comment into python_nogil_requirements.txt?
Maybe good enough for now? Longer term, I was thinking about using Docker to generate all the pip-based lock files.
I ended up doing what you suggested (I added the --upgrade option as well, which is needed to actually upgrade dependencies, as I discovered recently).
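Concretely, the manually-run command from the lock file header would become something like this (a sketch only; the image and paths are taken from the header quoted above and not re-verified):

# Same docker-based pip-compile invocation as in the lock file header, with --upgrade
# added so that already-pinned dependencies get bumped to their latest versions.
docker run -v $PWD:/scikit-learn -it nogil/python bash -c 'pip install pip-tools; pip-compile --upgrade /scikit-learn/build_tools/azure/python_nogil_requirements.txt -o /scikit-learn/build_tools/azure/python_nogil_lock.txt'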
Codewise LGTM
Assuming we have an auto-updating bot that opens a PR once a week to bump versions, what do you see the workflow looking like when the PR fails?
I guess a human has to investigate to understand the root cause of the issue. It might be simpler to fix it "by hand" (e.g. pin a package to a previous version) and regenerate the lock files with the script. Of course I am open to better suggestions. I am hoping that in most cases the PR updating the dependencies will pass. I had in mind to use the ...
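Concretely, the "fix by hand" workflow could look something like this (the pinned package, version and environment file name below are made up for illustration):

# 1. Pin the offending dependency in the relevant environment file, e.g. add
#    "pytest<7" to build_tools/azure/<some_build>_environment.yml (hypothetical names).
# 2. Regenerate the environment and lock files with the script:
python build_tools/azure/update_environments_and_lock_files.py
# 3. Commit the updated environment and lock files and open a PR with the fix.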
Here is a concrete example: imagine we have lock files in place, and pytest 6.2.5 was the latest locked version. The bot opens a PR trying to update to a newer pytest release ...
In both cases, we end up closing the PR created by the bot.
I am fine with your proposal.
It seems OK to me. Maybe you were hoping for a better workflow? I think the main advantage of the CI lock files is that we don't have to rush to fix it, since other PRs and main will be unaffected by the issue.
I'm on board with lock files. I'm trying to see what situations we end up with when we have lock files. Let's say new versions of NumPy and pytest are both causing errors. The most straightforward thing to do is to pin both of them in the environment files.

There are a few follow-up items to this PR:
Yep. On top of the things you mention, I also had in mind to switch to lock files: ...
Reference Issues/PRs
part of #22425
What does this implement/fix? Explain your changes.
This is a start:
Any other comments?
Right now I am thinking that the bot functionality is decoupled from using lock files in all the builds and it could happen in a second step.