Skip to content

PyPy in docker with Jupyter with 0.20.rc1 #11971

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
giodegas opened this issue Sep 2, 2018 · 25 comments · Fixed by #12007 or #12039
Closed

PyPy in docker with Jupyter with 0.20.rc1 #11971

giodegas opened this issue Sep 2, 2018 · 25 comments · Fixed by #12007 or #12039

Comments

@giodegas
Copy link

giodegas commented Sep 2, 2018

Running

pip install --pre scikit-learn

in a docker container with

docker pull giodegas/pypy-jupyter:scipy

The error:

/usr/local/site-packages/sklearn/externals/joblib/externals/loky/backend/compat.py in <module>()
     10 if sys.version_info[:2] >= (3, 3):
     11     import queue
---> 12     from _pickle import PicklingError
     13 else:
     14     import Queue as queue

ImportError: No module named '_pickle' 

shows up.
(you may not get the latest updated container from the docker hub site, since it is slow, I am working with a local copy).

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

As described here, it should be:

import pickle

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

If you find missing parts/libraries in my container, please open issues at the source repo:

giodegas/docker-pypy-jupyter

@rth
Copy link
Member

rth commented Sep 2, 2018

Thanks for the report! Indeed there was an update of the vendored version of joblib which includes loky since the PyPy PR was merged. This went undetected as currently we are still not running PyPy CI (need to fix that).

To fix this issue, one would need to make a PR to https://github.com/tomMoral/loky to fix this. Once it is fixed there (or at least the tests pass), we can make another PR in scikit-learn and hot-fix it here.

Would you be interested in making that PR?

For the rest, you can try to run tests with,

pytest --continue-on-collection-errors [...]

to at least some test results meanwhile. For the time being you can also patch this import issue inside your docker container.

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

Opened issue 157 in loky.
I will do the manual fix locally , see if it works , I will post the potest.log then I will try to include in the docker container.
I will post un update here ASAP.

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

@rth tried with before the fix:

pytest --pyargs --continue-on-collection-errors sklearn > pytest.log 

got pytest.log

does not go much more forward.

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

after the fix pytest takes a much longer time, good sign!
Waiting for it..

head pytest.log
============================= test session starts ==============================
platform linux -- Python 3.5.3[pypy-6.0.0-final], pytest-3.7.4, py-1.6.0, pluggy-0.7.1
rootdir: /, inifile:
collected 10028 items

usr/local/site-packages/sklearn/cluster/tests/test_affinity_propagation.py . [  0%]
......                                                                   [  0%]
usr/local/site-packages/sklearn/cluster/tests/test_bicluster.py ........ [  0%]
s.                                                                       [  0%]
usr/local/site-packages/sklearn/cluster/tests/test_birch.py .......      [  0%]

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

got this pytest.log.

Appears ok, waiting for your comments.

@giodegas
Copy link
Author

giodegas commented Sep 2, 2018

applied the fix to the Dockerfile, updated on docker hub, pulled a fresh container locally and got this
pytest.log .

@rth
Copy link
Member

rth commented Sep 2, 2018

Thanks for the logs! Essentially parallel processing with the updated joblib 0.12 that uses loky doesn't seem to work in PyPy (I will open an issue about it). The rest is fine. It means that in this version when PyPy users use any estimator with n_jobs>1 it will fail.

A workaround could be to use the previous version of joblib,

pip install joblib==0.11
export SKLEARN_SITE_JOBLIB=true

# use sklearn or run tests here

that should fix the remaining issues.

@rth rth added the pypy label Sep 2, 2018
@giodegas
Copy link
Author

giodegas commented Sep 3, 2018

ok I will try, thanks.

@rth rth added this to the 0.20 milestone Sep 4, 2018
@rth
Copy link
Member

rth commented Sep 4, 2018

PyPy support was added to loky in https://github.com/tomMoral/loky/pull/161, marking this as blocker for the final 0.20 release to indicate that we would need to sync with the latest loky/joblib version once they are released. Also we need to re-add CI to prevent something like this from happening again.

@giodegas
Copy link
Author

giodegas commented Sep 4, 2018

Great! Let me know how I can help.

@rth
Copy link
Member

rth commented Sep 4, 2018

Well we would need to make a PR that,

Feel free to make the PR )

Once the CI green, we would then need to way for the joblib (inlcuding loky) to be updated in scikit-learn to the latest released version before merging it.

@tomMoral
Copy link
Contributor

tomMoral commented Sep 4, 2018

The loky version in the vendored joblib should not be changed in scikit-learn as it will lead to conflicts.
@ogrisel is currently preparing a release of joblib and a PR to synchronize it in sklearn.

@rth
Copy link
Member

rth commented Sep 4, 2018

The loky version in the vendored joblib should not be changed in scikit-learn as it will lead to conflicts.

Can't we just test it in a PR? Just updating loky there wouldn't work? To avoid a situation that for another minor patch to loky, we need to make joblib & loky releases..

@tomMoral
Copy link
Contributor

tomMoral commented Sep 4, 2018

To avoid a situation that for another minor patch to loky, we need to make joblib & loky releases..

This is the issue with nested vendoring but I don't think it is such a good idea to start to rely on modified version of joblib. The advantage of including it in joblib is that we can ensure it is stable enough.

If we want to be able to modify loky without joblib, a solution would be to vendor loky separately, to be able to say that we are using version X.X of joblib and Y.Y of loky.

That being said, as loky is the default backend in joblib, this does not shock me to do minor patch release of joblib following minor patch release of loky so the system as it is now is not so complicated.

@ogrisel ogrisel mentioned this issue Sep 5, 2018
@ogrisel
Copy link
Member

ogrisel commented Sep 5, 2018

@giodegas Can you please check that your use case work fine with joblib 0.12.4?

pip install joblib==0.12.4
export SKLEARN_SITE_JOBLIB=true

@jnothman
Copy link
Member

jnothman commented Sep 5, 2018

It might make sense to add pypy Travis at least to the 0.20.X branch, and perhaps on Cron to master

ogrisel added a commit that referenced this issue Sep 5, 2018
This should fix #11971 (fixed PyPy support, pypy3 is now part of the joblib build matrix on travis).

It should also be backported to 0.20.X.
@ogrisel
Copy link
Member

ogrisel commented Sep 5, 2018

It might make sense to add pypy Travis at least to the 0.20.X branch, and perhaps on Cron to master

You mean to run the scikit-learn tests with pypy on travis (or circle ci) in a cron job? I agree.

@ogrisel
Copy link
Member

ogrisel commented Sep 5, 2018

BTW, I am currently running the scikit-learn tests on master with pypy3 and so far (50%) they pass.

@ogrisel
Copy link
Member

ogrisel commented Sep 5, 2018

I got a core dump when running sklearn/tests/test_docstring_parameters.py but this is unrelated to joblib.

@giodegas
Copy link
Author

giodegas commented Sep 9, 2018

I did a new run of the rebuilt container, got this sklearn_test.log with the outcome:

  28 failed, 9903 passed, 28 skipped, 64 xfailed, 5 xpassed, 926 warnings in 11380.18 seconds 

@giodegas
Copy link
Author

giodegas commented Sep 9, 2018

It appears the same as previous runs. How can I get the latest of all, without applying patches?
Is this enough?

 pip install --pre scikit-learn

@rth
Copy link
Member

rth commented Sep 10, 2018

You would need to install from master,

pip install https://github.com/scikit-learn/scikit-learn/archive/master.zip

but for now there will be a few failures, mostly in doctests. I'm trying to fix the remaining ones in #12039

@giodegas
Copy link
Author

giodegas commented Sep 10, 2018

ok I will wait for the #12039 before testing again, thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants