-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] BLD Specify build time dependencies via pyproject.toml #16244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I agree it would be good to merge this some time before the release. Is anything holding this? |
We need to decide if we do want this. The main concern is editable mode vs build isolation. With a "pyproject.toml", So in developer mode, the new way to do is:
Where I see the value of adding "pyproject.toml" is for people who install sklearn from source distribution. Today, one needs to install cython first otherwise pip will fail. But usually you'll never have to build sklearn again in that case, so here an isolated build makes totally sense. |
To keep track of previous discussions, the behavior reported in #15533 (comment) can't be reproduced. We tried it again with olivier and the build is still isolated. |
Are you sure -- do you have a ref? From my understanding of https://pip.pypa.io/en/stable/reference/pip/#pep-517-and-518-support, scikit-learn woudn't be entirely re-built. The worst thing that I guess could happen is that the build dependencies are still installed again in the temp dir, even if they are already satisfied. Though I hope that pip is smart enough to avoid that? I.e. I assume that merging this PR wouldn't change our workflow since we would still manually install cython numpy scipy and then |
tested it with different versions of pip from 10.X to 20.X. In verbose mode you can see the compilation happening twice when running
No, as I said,
|
There's a benefit beyond pulling in cython automatically; pip creates and caches wheels so it doesn't have to be rebuilt every time scikit-learn is installed. |
Indeed. This is because numpy is installed in the temp dir and Cython believes that it needs to recompile all its files that have a I think that if pip didn't installed numpy in the temp dir, the compilation issue would disappear. I've opened an issue on https://discuss.python.org (currently going through the spam filter, will link to it when available). EDIT: link to post |
I think it's due to cython being installed in the temp dir.
The detected change is in cython's header for numpy c api. It also occurs for non numpy stuff:
The problem is that we can't expect to have cython in the original env. |
Maybe Cython could be made smarter by using a sha256 hash of the the dependent source files instead of file timestamps? |
Looking at the source code of Cython, I think only mtime-based change detection is implemented: https://github.com/cython/cython/blob/master/Cython/Build/Dependencies.py |
I have the feeling that we will need to use the |
yeah I guess that's fine. It will only impact contributors that modify Cython files and that can't use We'd need to update the note in https://scikit-learn.org/stable/developers/contributing.html#how-to-contribute to introduce the |
If we want to service those who does not have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Maybe worth pinging other core devs for completeness
build_tools/azure/install.sh
Outdated
# to speed-up the building multicore CI machines. | ||
python setup.py build_ext --inplace -j 3 | ||
python setup.py develop | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NicolasHug you did not finish your comment, or am I missing something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there was a missing line return but it's been fixed (it showed on the main diff, maybe not in the PR comments page)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh I see. I must have fixed that in a recent commit indeed :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think using --no-build-isolation
by contributors is a modest price to pay for the advantages this could bring.
If we want to service those who does not have make, we can advise contributors to: python setup.py build_ext -i (and with -j if they want to do it faster).
I am sad there is no equivalent of -j
for pip install
(or at least I haven't found one), but I was under the impression that python setup.py ...
use is now discouraged, and I'm also not so keen on make in
as that's not universal with respect to other projects.
pip install --no-build-isolation -e .
is still the most standard way IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as well.
+1 for keeping the message simple and only mentioning the pip install --no-build-isolation -e .
command as it's cross-platform.
pyproject.toml
Outdated
"wheel", | ||
"Cython>=0.28.5", | ||
"numpy>=1.11.0", | ||
"scipy>=0.17.0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#15106 will be merged soon. Let's bump-up numpy to 1.13.3 and scipy to 0.19.1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
merged :)
BTW the discussion is going on at https://discuss.python.org/t/pep-518-and-editable-mode-dont-install-already-satisfied-dependencies/3124 if you are interested in contributing to pip to improve incremental building by reusing an existing build env instead of creating a new one each time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @jeremiedbb
@@ -5,3 +5,4 @@ recursive-include sklearn *.c *.h *.pyx *.pxd *.pxi *.tp | |||
recursive-include sklearn/datasets *.csv *.csv.gz *.rst *.jpg *.txt *.arff.gz *.json.gz | |||
include COPYING | |||
include README.rst | |||
include pyproject.toml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is needed because we need this file in the sdist
? Otherwise in the binary releases we don't need to have it. Is my understanding correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jeremiedbb
…16244) * add pyproject.toml * do it in CI * cln * no build isolation in the doc * bump dependencies * no-build-isolation everywhere needed
…16244) * add pyproject.toml * do it in CI * cln * no build isolation in the doc * bump dependencies * no-build-isolation everywhere needed
Fixes #15298
I basically took all what's done by adrin in #15533 but the cythonizing part.
That was discussed in several previous PRs (#15335, #15533, #15567) but we decided to not add it right before a release. We now have more time before the next release. Let's not delay that to the last minute.
It can also let us use external build tools like
extension-helpers
(#15917) oroldest-supported-numpy
(https://pypi.org/project/oldest-supported-numpy/).