Skip to content

MNT simplify pyproject.toml by using oldest-supported-numpy #18900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 6, 2021

Conversation

terencehonles
Copy link
Contributor

@terencehonles terencehonles commented Nov 23, 2020

Reference Issues/PRs

Follows up on comment by @alfaro96 #18761 (comment)

What does this implement/fix? Explain your changes.

Uses oldest-supported-numpy as it now matches the versions specified in pyproject.toml (@master but not released, see: scipy/oldest-supported-numpy#13)

On the comment #18761 (comment) we resolved we may not be able to upstream the PyPy requirements, but I realized we could constrain oldest-supported-numpy to only be used when not matching PyPy.

Any other comments?

This should be merged only pending a new release of oldest-supported-numpy. I'm asking about the release schedule now.

@terencehonles
Copy link
Contributor Author

terencehonles commented Nov 23, 2020

Thoughts @alfaro96 / @ogrisel ? This is just marked as a draft so it doesn't get merged before the updates are released to PyPI.

@ogrisel you had raised that this project may not want to match the exact versions from oldest-supported-numpy? #18761 (comment)

pyproject.toml Outdated
# platform_python_implementation!='CPython' not needed >= Python 3.7 which is numpy 1.14.5
"numpy==1.13.3; python_version=='3.6' and platform_machine!='aarch64' and platform_system!='AIX' and platform_python_implementation=='CPython'",
"numpy==1.14.0; python_version=='3.6' and platform_machine!='aarch64' and platform_system!='AIX' and platform_python_implementation!='CPython'",
"numpy==1.14.0; python_version=='3.6' and platform_machine!='aarch64' and platform_system!='AIX' and platform_python_implementation=='PyPy'",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this to specifically call out PyPy since this is what PyPy reports.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #18879, I have realized that numpy==1.19.0 has to be used for PyPy (the wheels are available from this version). Otherwise, numpy is compiled from source and this is something that we should avoid.

WDYT?

Copy link
Contributor Author

@terencehonles terencehonles Nov 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change doesn't change anything there so it's not problematic immediately (not a regression and might not be something that should block this PR). I do think that if I were to bump this here I'd need to also add your change here 9c27b7e#diff-83a4b90ed0c61f5e1a834555270a9d1e687cd4ece074262fcd18e7fcf160e6d5L9-R9 and it might make sense to bump the version in #18879 unless you think that will merge before this (and I could update this to match master)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyPy support PR for oldest-supported-numpy: scipy/oldest-supported-numpy#15

I have realized that numpy==1.19.0 has to be used for PyPy (the wheels are available from this version). Otherwise, numpy is compiled from source and this is something that we should avoid.

That sounds right to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alfaro96 did you want me to bump the min supported numpy for PyPy to be 1.19.0? As @rgommers mentions the PyPy spec of 1.19.0 has been upstreamed to oldest-supported-numpy. If we're going to be bumping the version here I might as well simplify this all to just use that with no constraints.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we could rely on oldest-supported-numpy for PyPy. Then we would not need to keep track of the NumPy wheels available for each architecture (as @rth stated in #18900 (comment)).

Copy link
Member

@alfaro96 alfaro96 Dec 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, I think that we need to update the minimum version of NumPy for PyPy in sklearn/_min_dependencies.py.

CC @thomasjpfan.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, I think that we need to update the minimum version of NumPy for PyPy in sklearn/_min_dependencies.py.

CC @thomasjpfan.

That's actually what I meant about bumping the min supported numpy 😅

@rgommers
Copy link
Contributor

@ogrisel you had raised that this project may not want to match the exact versions from oldest-supported-numpy? #18761 (comment)

Let me post a couple of corrections to that comment:

  • It's meant to really be the oldest version of NumPy that will build (or have numpy wheels, like for aarch64) for a given Python version and platform, not updating in sync with NEP 29. So it would surprise me if there's versions there that are higher than scikit-learn has in its pyproject.toml
  • It's not by NumPy devs, it came from Astropy originally and is now hosted under the SciPy org as a common resource.

Scikit-learn has more resources that most projects, so I won't advocate for or against this PR, but I hope that oldest-supported-numpy would at least in principle work - if not I'd like to learn why.

@terencehonles terencehonles marked this pull request as ready for review November 24, 2020 18:17
@terencehonles
Copy link
Contributor Author

oldest-supported-numpy has just released a version which includes the aarch64 changes 🎉

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @terencehonles !

With this change we will be delegating this responsibility to oldest-support-numpy. This kind of pushes pyproject.toml as the "source of truth" for specifying the version of our build dependencies, which I am okay with.

@terencehonles May you push a commit with [cd build] in the commit message to trigger the wheel building pipeline?


# ARM needs numpy >= 1.19.0
# platform_machine!='aarch64' not needed >= Python 3.9 which is numpy 1.19.3
"numpy==1.19.0; python_version=='3.6' and platform_machine=='aarch64'",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reference, this is 1.19.2 in https://github.com/scipy/oldest-supported-numpy/blob/master/setup.cfg but I am okay with this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a result of @rgommers comment here scipy/oldest-supported-numpy#13 (comment)

@terencehonles terencehonles force-pushed the simplify-pyproject.toml branch 3 times, most recently from e219e84 to de8a72c Compare November 25, 2020 01:09
@terencehonles
Copy link
Contributor Author

terencehonles commented Nov 25, 2020

@thomasjpfan they look like they built properly.

The [cd build] tag needs to be in the commit message subject and the cp38-win32 builder appeared to be flakey. Not sure if that's a known issue or something was just acting up.

See: https://github.com/scikit-learn/scikit-learn/runs/1450732587?check_suite_focus=true

@rth
Copy link
Member

rth commented Nov 25, 2020

I'm also not opposed to outsource the decision on the minimal supported numpy version to the oldest-supported-numpy package, assuming

  • the versions used are conservative
  • there is a guarantee that binary wheels for those combinations of Python, numpy versions and architecture exist.
  • that once defined the minimal numpy for a given python version remains fixed for x86_64 (we still need to manually indicate the minimal supported numpy version in the documentation), excluding pypy and more exotic architectures, where I understand that other constraints could apply.

Which seems to be the case if I understand #18900 (comment) correctly

We would need more feedback before approving this PR though. cc @jeremiedbb @ogrisel @NicolasHug @glemaitre @alfaro96

@rth
Copy link
Member

rth commented Nov 25, 2020

Also I think the main selling point of this PR is not that it simplifies the pyproject.toml but that we wouldn't have to keep track of what numpy wheels are available for what architectures or python version (which also implies that a given numpy version supports those python versions/architectures).

@terencehonles
Copy link
Contributor Author

Updated based on #18900 (comment), and I bumped the min numpy version for PyPy as per the change in #18879 (see #18900 (comment) ).

I realized that oldest-supported-numpy needs to be released as of scipy/oldest-supported-numpy#15 (I asked yesterday, but will give them some time)

@ogrisel
Copy link
Member

ogrisel commented Dec 3, 2020

It seems good to me, thanks for PR and thanks @rgommers for the explanation. It seems it will indeed reduce maintenance efforts.

My only concern is, would it be possible to get inconsistent build configuration between this dependency in pyproject.toml and NUMPY_MIN_VERSION in sklearn/_min_dependencies.py (although the later is more of runtime dependency). For instance, we could be in a situation where scikit-learn would require a more recent version of numpy to be able to build than what oldest-supported-numpy specifies.

But our nightly builds should be able to detect those cases if they happen and maybe we can adapt on the fly.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for merging once a release of oldest-supported-numpy with scipy/oldest-supported-numpy#15 is online.

@alfaro96
Copy link
Member

alfaro96 commented Dec 3, 2020

Note: We should remember to trigger the Wheel builder workflow before merging the PR to ensure that it is working as expected (the right version of numpy is selected).

@rgommers
Copy link
Contributor

rgommers commented Dec 3, 2020

My only concern is, would it be possible to get inconsistent build configuration between this dependency in pyproject.toml and NUMPY_MIN_VERSION in sklearn/_min_dependencies.py (although the later is more of runtime dependency). For instance, we could be in a situation where scikit-learn would require a more recent version of numpy to be able to build than what oldest-supported-numpy specifies.

The only situation where I can imagine that happening is for more exotic platforms where oldest-supported-numpy could bump to a more recent version if a NumPy wheel for it appears (like for PyPy). But in that case updating NUMPY_MIN_VERSION seems desirable anyway.

By the way, NUMPY_MIN_VERSION doesn't accurately track minimum possible versions anyway - it's just "something really old, so we get all supported NumPy versions for a given Python version". The current default of 1.13.3 is so old that that NumPy version doesn't support Python >= 3.7. So in effect it's a do-nothing constraint.

@terencehonles
Copy link
Contributor Author

terencehonles commented Dec 3, 2020

It seems good to me, thanks for PR and thanks @rgommers for the explanation. It seems it will indeed reduce maintenance efforts.

My only concern is, would it be possible to get inconsistent build configuration between this dependency in pyproject.toml and NUMPY_MIN_VERSION in sklearn/_min_dependencies.py (although the later is more of runtime dependency). For instance, we could be in a situation where scikit-learn would require a more recent version of numpy to be able to build than what oldest-supported-numpy specifies.

But our nightly builds should be able to detect those cases if they happen and maybe we can adapt on the fly.

@ogrisel should the build tooling emit the version pulled in by oldest-supported-numpy? That seems like it could be added in a separate effort, but when building wheels there would be enough context to fix the issue that @rgommers points out:

By the way, NUMPY_MIN_VERSION doesn't accurately track minimum possible versions anyway - it's just "something really old, so we get all supported NumPy versions for a given Python version". The current default of 1.13.3 is so old that that NumPy version doesn't support Python >= 3.7. So in effect it's a do-nothing constraint.

That way it would be at least correct at build time for the wheel, but obviously it wouldn't necessarily match a rebuild if oldest-supported-numpy changes (which is something you might be pointing out and also mentioned by @rgommers 's first comment)

@thomasjpfan
Copy link
Member

@ogrisel should the build tooling emit the version pulled in by oldest-supported-numpy?

Moving foward, we can link to oldest-supported-numpy to show the support versions of numpy. The matrix of oldest supported numpy has gotten bigger as platforms are added. Technically, we can extract the metadata from oldest-supported-numpy setup.cfg file if we really need them.

My only concern is, would it be possible to get inconsistent build configuration between this dependency in pyproject.toml and NUMPY_MIN_VERSION in sklearn/_min_dependencies.py (although the later is more of runtime dependency). For instance, we could be in a situation where scikit-learn would require a more recent version of numpy to be able to build than what oldest-supported-numpy specifies.

sklearn/_min_dependencies.py was developed to have a "single source of truth" for managing our dependencies. With this PR, we are moving toward using pyproject.toml to have this metadata. If this is the case, I would want to use pyproject.toml as the "single source of truth". I will see what I can do on this end.

On a side note, once https://www.python.org/dev/peps/pep-0621/ gets approved, we can move most of the metadata into pyproject.toml. Specifically the optional-dependencies would be useful for "test" or "doc" requirements.

@terencehonles
Copy link
Contributor Author

terencehonles commented Dec 5, 2020

@ogrisel should the build tooling emit the version pulled in by oldest-supported-numpy?

Technically, we can extract the metadata from oldest-supported-numpy setup.cfg file if we really need them.

That's basically what I was suggesting, but when building the wheels you'd have that version installed and you could just save the numpy version. Otherwise you'd have to fetch the package or source and parse setup.cfg which the package manager would have already done for you when "installing" oldest-supported-numpy.

My question is more of: "is this necessary" / "of value"? If linking to setup.cfg is sufficient (as you asked me to add in a comment, and I did) then sklearn/_min_dependencies.py doesn't "have" to be maintained (I'm not sure how it's used).

@alfaro96 alfaro96 mentioned this pull request Dec 7, 2020
12 tasks
@terencehonles
Copy link
Contributor Author

Looks like oldest-supported-numpy has been released scipy/oldest-supported-numpy#15 (comment)

@ogrisel
Copy link
Member

ogrisel commented Dec 15, 2020

@ogrisel should the build tooling emit the version pulled in by oldest-supported-numpy?

That would be great. At leas the scikit-learn setup.py could just print the numpy and scipy versions on stdout so that we can have a look at the build logs on the CI.

My concern is who can we find the version of numpy was used to build a given, old, version of the scikit-learn wheel. Maybe we could embed this info into some kind of generated private python submodule of sklearn package and shipped into the wheel. Ideally this would not part of the sdist / .tar.gz distribution, just the wheels and conda packages.

But we can do this improvement in a later PR.

@ogrisel
Copy link
Member

ogrisel commented Dec 15, 2020

+1 for merging this PR as it is if the [cd build] above is green. However I would not backport this to 0.24.X right away, just to keep the 0.24.X as stable as possible.

Between 0.24.0 and 1.0.0 we will have plenty of time to see if this new way to define the build dependencies overtime. In particular I am not sure how it will impact the scikit-learn-feedstock on conda-forge.

@rgommers
Copy link
Contributor

My concern is who can we find the version of numpy was used to build a given, old, version of the scikit-learn wheel. Maybe we could embed this info into some kind of generated private python submodule of sklearn package and shipped into the wheel.

numpy.distutils puts this kind of stuff in pkgname/__config__.py and adds a show_config function, and it's quite handy. We have wanted the numpy build version in scipy/__config__.py for a long time (it's not there currently). Using the same names for scikit-learn (whether or not generated with numpy.distutils could be useful. The make_config_py method in numpy/distutils/misc_util.py generates the file.

@terencehonles
Copy link
Contributor Author

+1 for merging this PR as it is if the [cd build] above is green. However I would not backport this to 0.24.X right away, just to keep the 0.24.X as stable as possible.

The codecov diff failure looks to be the only thing not passing and I believe I had previously ran [cd build] so I had expected it not to fail.

Base automatically changed from master to main January 22, 2021 10:53
@terencehonles
Copy link
Contributor Author

Was there anything that needs to be updated here?

@cmarmo
Copy link
Contributor

cmarmo commented Feb 5, 2021

Was there anything that needs to be updated here?

Hi @terencehonles thanks for your patience! Do you mind synchronizing with upstream? Renaming the main branch made some checks to fail. Thanks!

@terencehonles terencehonles force-pushed the simplify-pyproject.toml branch from c7da13e to d6b4348 Compare February 5, 2021 18:25
@terencehonles
Copy link
Contributor Author

Was there anything that needs to be updated here?

Hi @terencehonles thanks for your patience! Do you mind synchronizing with upstream? Renaming the main branch made some checks to fail. Thanks!

Done, but I believe they may have already failed before and I believe they were just CI errors (but it looks like at least the one I checked its history is lost so we'll see after this run).

@alfaro96
Copy link
Member

alfaro96 commented Feb 6, 2021

Was there anything that needs to be updated here?

Hi @terencehonles thanks for your patience! Do you mind synchronizing with upstream? Renaming the main branch made some checks to fail. Thanks!

Done, but I believe they may have already failed before and I believe they were just CI errors (but it looks like at least the one I checked its history is lost so we'll see after this run).

The rest of failures are unrelated with this PR. See: #19316.

Therefore, LGTM 😉.

@ogrisel
Copy link
Member

ogrisel commented Feb 6, 2021

Ok let's give it a try. We still have to to revert it before 1.0 if we discover problems in the mean time.

@ogrisel ogrisel merged commit d6bd7be into scikit-learn:main Feb 6, 2021
@terencehonles terencehonles deleted the simplify-pyproject.toml branch February 10, 2021 00:43
@glemaitre glemaitre mentioned this pull request Apr 22, 2021
12 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants