Skip to content

[MRG] Issue 7867: Install numpy and scipy using "install_requires" #10402

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

barrywhart
Copy link
Contributor

@barrywhart barrywhart commented Jan 4, 2018

Alternate, simpler PR after feedback on #10398. We can iterate on this one but possibly fall back to the other one if this doesn't work out.

In my testing, installing from a .tar.gz appears to work (i.e. I can import the sklearn module afterwards) but there are some errors displayed during installation. I don't know if those are a concern or not. Full output below.

Installing from a wheel worked smoothly.

Installing from source tarball (Mac OS X Sierra, Python 2.7.13):

$ pip install scikit-learn/dist/scikit-learn-0.20.dev0.tar.gz
Processing ./scikit-learn/dist/scikit-learn-0.20.dev0.tar.gz
Collecting numpy>=1.8.2 (from scikit-learn==0.20.dev0)
  Downloading https://artifactory.rsglab.com/artifactory/api/pypi/pypi/packages/ea/df/f0671353e3d2eab4c87df014ad09505268561f6b09d0dc257d1b32653cfe/numpy-1.13.3-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (4.6MB)
Collecting scipy>=0.13.3 (from scikit-learn==0.20.dev0)
  Downloading https://artifactory.rsglab.com/artifactory/api/pypi/pypi/packages/4d/e4/e92135b070c0913cbee59849b61f57076ac33d8a754be0fef581a28676f9/scipy-1.0.0-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (16.8MB)
Building wheels for collected packages: scikit-learn
  Running setup.py bdist_wheel for scikit-learn: started
  Running setup.py bdist_wheel for scikit-learn: finished with status 'error'
  Complete output from command /Users/bhart/.pyenv/versions/2.7.13/envs/scikit-learn-2.7.13/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/private/var/folders/vd/j7yv3l2x0tb2t_7pvwrvhr2r0000gq/T/pip-GZj9kO-build/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /var/folders/vd/j7yv3l2x0tb2t_7pvwrvhr2r0000gq/T/tmp5WQUMIpip-wheel- --python-tag cp27:
  Partial import of sklearn during the build process.
  Traceback (most recent call last):
    File "/private/var/folders/vd/j7yv3l2x0tb2t_7pvwrvhr2r0000gq/T/pip-GZj9kO-build/setup.py", line 149, in get_numpy_status
      import numpy
  ImportError: No module named numpy
  Traceback (most recent call last):
    File "<string>", line 1, in <module>
    File "/private/var/folders/vd/j7yv3l2x0tb2t_7pvwrvhr2r0000gq/T/pip-GZj9kO-build/setup.py", line 241, in <module>
      setup_package()
    File "/private/var/folders/vd/j7yv3l2x0tb2t_7pvwrvhr2r0000gq/T/pip-GZj9kO-build/setup.py", line 231, in setup_package
      .format(numpy_req_str, instructions))
  ImportError: Numerical Python (NumPy) is not installed.
  scikit-learn requires NumPy >= 1.8.2.
  Installation instructions are available on the scikit-learn website: http://scikit-learn.org/stable/install.html
  
  
  ----------------------------------------
  Running setup.py clean for scikit-learn
Failed to build scikit-learn
Installing collected packages: numpy, scipy, scikit-learn
  Running setup.py install for scikit-learn: started
    Running setup.py install for scikit-learn: still running...
    Running setup.py install for scikit-learn: finished with status 'done'
Successfully installed numpy-1.13.3 scikit-learn-0.20.dev0 scipy-1.0.0
$ python
Python 2.7.13 (default, Jun  8 2017, 17:00:51)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> import numpy
>>> import scipy
>>>

Fixes #10398, fix #7867.

@lesteve
Copy link
Member

lesteve commented Jan 4, 2018

To remove the errors from the log, I think you need to remove the numpy_status stuff like you removed the scipy one. Also maybe look at the whole file to spot opportunities for clean-up. Amongst other things there probably are some comments that don't make sense any more once this PR is merged.

@lesteve
Copy link
Member

lesteve commented Jan 4, 2018

Small github tip: I edited your PR description to use "Fix #issueNumber", this way the associated issue gets closed automatically when the PR is merged. For more details, look at this.

@barrywhart
Copy link
Contributor Author

@lesteve: About the errors, njsmith had suggested:

If import numpy doesn't work, then don't use setup_requires, just print an error and exit.

My question is where and how to do that in order to get the desired effect? I'm thinking as written currently, it is checking in some cases when it shouldn't. This seems like a tricky area requiring expertise in Python setuptools. I notice that both scikit-learn and scipy have some tricky command-line parsing in setup.py to try and determine whether certain steps are needed.

Specific suggestions would be appreciated.

@barrywhart
Copy link
Contributor Author

I looked and don't see any obvious opportunities for cleanup. We can take another look at that once bigger issues are addressed (e.g. the spurious error building a wheel during source installation).

@lesteve
Copy link
Member

lesteve commented Jan 12, 2018

I looked at this a bit more and I think this is fine as it is despite the spurious error:

  • this only happens when installing from the .tar.gz (and only when you have the wheels package installed). In practice, wheels are available on all platforms for recent releases.
  • the scikit-learn package ends up being installed correctly anyway

I feel the benefit of having scikit-learn installable via pip install scikit-learn in most cases outweighs by far the downside of having a spurious error in the middle of the installation that happens only in a few cases.

@jnothman
Copy link
Member

Why should we not keep the scipy check removed here?

@barrywhart
Copy link
Contributor Author

It shouldn't be needed anymore; install_requires will take care of installing it, so we'd just be trusting the package manager (i.e. pip) to work correctly.

@jnothman
Copy link
Member

Well, I've not done much packaging, but don't think this PR does harm, and will believe the experts. Let's merge.

@jnothman jnothman merged commit 66bf809 into scikit-learn:master Jan 14, 2018
@amueller
Copy link
Member

I'm so happy to see this. I'm pretty sure this solves some more open issues/PRs

@amueller
Copy link
Member

Do we want to add Cython as well? Or is install_requires not the right one for that?

@lesteve
Copy link
Member

lesteve commented Mar 20, 2018

Do we want to add Cython as well?

Probably not: we don't want people who do pip install scikit-learn to get cython.

@amueller
Copy link
Member

amueller commented Mar 20, 2018

Yeah we need a "build_requires", which unfortunately doesn't exist.

@amueller
Copy link
Member

Ideally we want people who do pip install scikit-learn for a platform where there is no wheel to get cython..., but not if there is a wheel....

@jtlz2
Copy link

jtlz2 commented May 30, 2018

@amueller e.g. Alpine Linux (trying to install scikit-learn on there now :, but running into this issue...).

@lesteve
Copy link
Member

lesteve commented May 31, 2018

Note: this feature (pip install scikit-learn installing automatically numpy and scipy) will be available when scikit-learn 0.20 is released.

@amueller
Copy link
Member

amueller commented Jun 2, 2018

OMG I forgot this happened. We should have a whatsnew entry, right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Register build-time dependencies in the setup.py as a setup_requirement.
5 participants