Skip to content

[MRG] MNT Re-enable PyPy CI #12039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Sep 18, 2018
Merged

[MRG] MNT Re-enable PyPy CI #12039

merged 11 commits into from
Sep 18, 2018

Conversation

rth
Copy link
Member

@rth rth commented Sep 8, 2018

Closes #11971 (though it's already closed)

This re-enables PyPy Circle CI build.

TODO

  • make sure tests pass
  • make it a Circle CI cron job

@@ -43,6 +44,7 @@
]


@fails_if_pypy # this segfaults with PyPy-6.0
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test it not relevant for scikit-learn users (it checks docstring formatting), so I think we can safely skip it in PyPy.

@@ -162,6 +163,7 @@ def test_multi_target_sample_weights():
classes = list(map(np.unique, (y1, y2, y3)))


@fails_if_pypy # FIXME
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails due to not being able to allocate memory somewhere in loky,

    def test_multi_output_classification_partial_fit_parallelism():
        sgd_linear_clf = SGDClassifier(loss='log', random_state=1, max_iter=5)
        mor = MultiOutputClassifier(sgd_linear_clf, n_jobs=-1)
>       mor.partial_fit(X, y, classes)

sklearn/tests/test_multioutput.py:168: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
sklearn/utils/metaestimators.py:118: in <lambda>
    out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
sklearn/multioutput.py:121: in partial_fit
    sample_weight, first_time) for i in range(y.shape[1]))
sklearn/externals/joblib/parallel.py:996: in __call__
    self.retrieve()
sklearn/externals/joblib/parallel.py:899: in retrieve
    self._output.extend(job.get(timeout=self.timeout))
pypy-env/lib_pypy/_functools.py:80: in __call__
    return self._func(*(self._args + fargs), **fkeywords)
sklearn/externals/joblib/_parallel_backends.py:517: in wrap_future_result
    return future.result(timeout=timeout)
/usr/local/lib-python/3/concurrent/futures/_base.py:405: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <Future at 0x10052410 state=finished raised BrokenProcessPool>

    def __get_result(self):
        if self._exception:
>           raise self._exception
E           sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

/usr/local/lib-python/3/concurrent/futures/_base.py:357: BrokenProcessPool
----------------------------- Captured stderr call -----------------------------
/usr/local/lib-python/3/multiprocessing/semaphore_tracker.py:129: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
  len(cache))
Traceback (most recent call last):
  File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/root/project/sklearn/__init__.py", line 64, in <module>
    from .base import clone
  File "/root/project/sklearn/base.py", line 13, in <module>
    from .utils.fixes import signature
  File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
    from .validation import (as_float_array,
  File "/root/project/sklearn/utils/validation.py", line 27, in <module>
    from ..utils._joblib import Memory
  File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
    from ..externals.joblib import __all__   # noqa
  File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 22, in <module>
    from .executor import get_memmapping_executor
  File "/root/project/sklearn/externals/joblib/executor.py", line 14, in <module>
    from .externals.loky.reusable_executor import get_reusable_executor
  File "/root/project/sklearn/externals/joblib/externals/loky/__init__.py", line 11, in <module>
    from .backend.context import cpu_count
OSError: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
  File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/root/project/sklearn/__init__.py", line 64, in <module>
    from .base import clone
  File "/root/project/sklearn/base.py", line 13, in <module>
    from .utils.fixes import signature
  File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
    from .validation import (as_float_array,
  File "/root/project/sklearn/utils/validation.py", line 27, in <module>
    from ..utils._joblib import Memory
  File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
    from ..externals.joblib import __all__   # noqa
  File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 20, in <module>
    from .pool import MemmappingPool
  File "/root/project/sklearn/externals/joblib/pool.py", line 33, in <module>
    from ._memmapping_reducer import get_memmapping_reducers
OSError: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
  File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
    __import__(pkg_name)
  File "/root/project/sklearn/__init__.py", line 64, in <module>
    from .base import clone
  File "/root/project/sklearn/base.py", line 13, in <module>
    from .utils.fixes import signature
  File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
    from .validation import (as_float_array,
  File "/root/project/sklearn/utils/validation.py", line 27, in <module>
    from ..utils._joblib import Memory
  File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
    from ..externals.joblib import __all__   # noqa
  File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 22, in <module>
    from .executor import get_memmapping_executor
  File "/root/project/sklearn/externals/joblib/executor.py", line 14, in <module>
    from .externals.loky.reusable_executor import get_reusable_executor
  File "/root/project/sklearn/externals/joblib/externals/loky/__init__.py", line 12, in <module>
    from .reusable_executor import get_reusable_executor
  File "/root/project/sklearn/externals/joblib/externals/loky/reusable_executor.py", line 11, in <module>
    from .process_executor import ProcessPoolExecutor, EXTRA_QUEUED_CALLS
  File "/root/project/sklearn/externals/joblib/externals/loky/process_executor.py", line 82, in <module>
    from .backend.utils import recursive_terminate
OSError: [Errno 12] Cannot allocate memory

will report this upstream.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if it's the case on circle but n_jobs=-1 is dangerous on travis with pypy: the robust detection of the current number of cpu based on os.sched_getaffinity is not possible in pypy (not implemented). Instead joblib / loky fallsback to the basic cpu count detection that can report 32 cores on travis while on 2 are actually available.

Launching 32 pypy worker processes might exhaust the available memory of the worker.

Two solutions:

  • export LOKY_MAX_CPU_COUNT="2" in the pypy test setup
  • change the test itself to n_jobs=2

I think I prefer the LOKY_MAX_CPU_COUNT solution.

@@ -24,6 +25,7 @@ def test_get_deps_info():
assert 'pandas' in deps_info


@fails_if_pypy
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fails in numpy.distutils.system_info.get_info reported upstream in numpy/numpy#11968

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very strange. I cannot reproduce locally

>>>> import sys, numpy; print(numpy.__version__, sys.version)
1.14.3 3.5.3 (fdd60ed87e94, Apr 24 2018, 06:10:04)
[PyPy 6.0.0 with GCC 6.2.0 20160901]

@rth
Copy link
Member Author

rth commented Sep 17, 2018

Thanks for checking @mattip and for suggestions @ogrisel !

Yes, this is very strange. Locally all tests pass for me here, but it appears that in the failing tests in CI, we are somehow escaping from the PyPy virtualenv and end up using the standard library from system CPython. An example is below,

sklearn/utils/_show_versions.py:85: in _get_blas_info
    cblas_libs, blas_dict = get_blas_info()
sklearn/_build_utils/__init__.py:33: in get_blas_info
    blas_info = get_info('blas_opt', 0)
pypy-env/site-packages/numpy/distutils/system_info.py:433: in get_info
    return cl().get_info(notfound_action)
pypy-env/site-packages/numpy/distutils/system_info.py:625: in get_info
    self.calc_info()
pypy-env/site-packages/numpy/distutils/system_info.py:1632: in calc_info
    atlas_info = get_info('atlas_blas')
pypy-env/site-packages/numpy/distutils/system_info.py:433: in get_info
    return cl().get_info(notfound_action)
pypy-env/site-packages/numpy/distutils/system_info.py:625: in get_info
    self.calc_info()
pypy-env/site-packages/numpy/distutils/system_info.py:1226: in calc_info
    atlas_version, atlas_extra_info = get_atlas_version(**atlas)
pypy-env/site-packages/numpy/distutils/system_info.py:1459: in get_atlas_version
    c = cmd_config(Distribution())
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <numpy.distutils.command.config.config object at 0x0000000010c9e6e8>
dist = <distutils.dist.Distribution object at 0x0000000010c9e720>

    def __init__(self, dist):
        """Create and initialize a new Command object.  Most importantly,
            invokes the 'initialize_options()' method, which is the real
            initializer and depends on the actual command being
            instantiated.
            """
        # late import because of mutual dependence between these classes
        from distutils.dist import Distribution
    
        if not isinstance(dist, Distribution):
>           raise TypeError("dist must be a Distribution instance")
E           TypeError: dist must be a Distribution instance

/usr/local/lib-python/3/distutils/cmd.py:57: TypeError

where distutils.dist.Distribution appears to be coming from /usr/local/lib-python. Will try to see if using python -m pytest instead of pytest directly might help.

@rth rth force-pushed the pypy-ci branch 2 times, most recently from 60700ce to 826513d Compare September 17, 2018 13:56
@ogrisel
Copy link
Member

ogrisel commented Sep 17, 2018

You might need to deactivate the default virtualenv provided by circle ci in build_tools/circle/build_test_pypy.sh:

# deactivate circleci virtualenv and setup a pypy specific env instead
if [[ `type -t deactivate` ]]; then
  deactivate
fi

this is taken from the build_doc.sh script in the same folder.

It should be inserted before creating the pypy3 venv.

@rth
Copy link
Member Author

rth commented Sep 17, 2018

You might need to deactivate the default virtualenv provided by circle ci

Yes, but we are using the official PyPy docker image (https://hub.docker.com/_/pypy/) here, not the Circle CI one, and I don't think it includes its own virtualenv (cf Dockerfile).

Now I wonder if the issue cannot come from pytest-xdist used here to run tests in separate processes to catch segfaults. Will try to disable that.

@jnothman
Copy link
Member

I've not been following. There are green ticks. Is this good enough time release 0.20 with?

@rth rth changed the title [WIP] MNT Re-enable PyPy CI [MRG] MNT Re-enable PyPy CI Sep 17, 2018
@rth
Copy link
Member Author

rth commented Sep 17, 2018

Yes, I think it's good enough. Pushed a last commit to clean things up.

This is ready to be reviewed, the CI should pass.

For 0.20, it might be good to have this in it's current form, while on master add a cron job eventually. Should I make a separate PR for the cron job once this is merged and included in backports?

def test_docstring_parameters():
# Test module docstring formatting

if IS_PYPY:
raise SkipTest('this test segfaults on PyPy')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can't use a mark like skipif?

@@ -65,6 +65,21 @@ jobs:
path: ~/log.txt
destination: log.txt

pypy3:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to do it on CRON in master, do that here, and make a separate PR to 0.20.X?

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, +1 for merge once @jnothman comments are dealt with.


pip install -e .
pip install -vv -e .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it still useful to keep the verbose mode here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still useful, because sometimes the install can take more than 10min, in which case CircleCI will kill the job if there is no output to stdout. It happened once in this PR.

@rth
Copy link
Member Author

rth commented Sep 18, 2018

Thanks for the reviews!

I addressed all the remaining comments, and updated .circleci/config.yaml to,

  • run the pypy3 job on the 0.20.X branch for each commit
  • run the pypy3 job on master daily using a cron job.

Note that it will not run the pypy3 job for PRs that target 0.20.X as the corresponding functionality is not yet implemented in Circle CI

This can be backported as it is to 0.20.X.

@rth
Copy link
Member Author

rth commented Sep 18, 2018

Note that it will not run the pypy3 job for PRs that target 0.20.X

Or maybe I misread the docs, and it will trigger it. I guess we will see..

@ogrisel
Copy link
Member

ogrisel commented Sep 18, 2018

LGTM, let's merge to see if the trigger works as expected on master. @jnothman I let you handle the backport in your PR.

@ogrisel ogrisel merged commit b91cbda into scikit-learn:master Sep 18, 2018
@rth rth deleted the pypy-ci branch September 18, 2018 21:34
@rth
Copy link
Member Author

rth commented Sep 19, 2018

Thanks! Looks like the pypy cron job did trigger on master as expected

https://circleci.com/gh/scikit-learn/scikit-learn/tree/master

jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Sep 22, 2018
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Nov 29, 2018
* tag '0.20.0': (77 commits)
  ColumnTransformer generalization to work on empty lists (scikit-learn#12084)
  add sparse_threshold to make_columns_transformer (scikit-learn#12152)
  [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104)
  Change version to 0.20.0
  BUG: check equality instead of identity in check_cv (scikit-learn#12155)
  [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114)
  [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089)
  DOC Removed duplicated doc in tree.rst (scikit-learn#11922)
  [MRG] DOC covariance doctest examples (scikit-learn#12124)
  typo and formatting fixes in 0.20 doc (scikit-learn#11963)
  DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133)
  [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107)
  Fix typo (scikit-learn#12126)
  DOC Typo in OneHotEncoder
  DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117)
  DOC Removing quotes from variant names. (scikit-learn#12113)
  DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108)
  [MRG] MNT Re-enable PyPy CI (scikit-learn#12039)
  MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048)
  TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Nov 29, 2018
* releases: (77 commits)
  ColumnTransformer generalization to work on empty lists (scikit-learn#12084)
  add sparse_threshold to make_columns_transformer (scikit-learn#12152)
  [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104)
  Change version to 0.20.0
  BUG: check equality instead of identity in check_cv (scikit-learn#12155)
  [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114)
  [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089)
  DOC Removed duplicated doc in tree.rst (scikit-learn#11922)
  [MRG] DOC covariance doctest examples (scikit-learn#12124)
  typo and formatting fixes in 0.20 doc (scikit-learn#11963)
  DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133)
  [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107)
  Fix typo (scikit-learn#12126)
  DOC Typo in OneHotEncoder
  DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117)
  DOC Removing quotes from variant names. (scikit-learn#12113)
  DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108)
  [MRG] MNT Re-enable PyPy CI (scikit-learn#12039)
  MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048)
  TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Nov 29, 2018
* dfsg: (77 commits)
  ColumnTransformer generalization to work on empty lists (scikit-learn#12084)
  add sparse_threshold to make_columns_transformer (scikit-learn#12152)
  [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104)
  Change version to 0.20.0
  BUG: check equality instead of identity in check_cv (scikit-learn#12155)
  [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114)
  [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089)
  DOC Removed duplicated doc in tree.rst (scikit-learn#11922)
  [MRG] DOC covariance doctest examples (scikit-learn#12124)
  typo and formatting fixes in 0.20 doc (scikit-learn#11963)
  DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133)
  [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107)
  Fix typo (scikit-learn#12126)
  DOC Typo in OneHotEncoder
  DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117)
  DOC Removing quotes from variant names. (scikit-learn#12113)
  DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108)
  [MRG] MNT Re-enable PyPy CI (scikit-learn#12039)
  MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048)
  TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PyPy in docker with Jupyter with 0.20.rc1
4 participants