-
-
Notifications
You must be signed in to change notification settings - Fork 25.8k
[MRG] MNT Re-enable PyPy CI #12039
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] MNT Re-enable PyPy CI #12039
Conversation
e21a1a3
to
16b9f76
Compare
@@ -43,6 +44,7 @@ | |||
] | |||
|
|||
|
|||
@fails_if_pypy # this segfaults with PyPy-6.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test it not relevant for scikit-learn users (it checks docstring formatting), so I think we can safely skip it in PyPy.
sklearn/tests/test_multioutput.py
Outdated
@@ -162,6 +163,7 @@ def test_multi_target_sample_weights(): | |||
classes = list(map(np.unique, (y1, y2, y3))) | |||
|
|||
|
|||
@fails_if_pypy # FIXME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This fails due to not being able to allocate memory somewhere in loky,
def test_multi_output_classification_partial_fit_parallelism():
sgd_linear_clf = SGDClassifier(loss='log', random_state=1, max_iter=5)
mor = MultiOutputClassifier(sgd_linear_clf, n_jobs=-1)
> mor.partial_fit(X, y, classes)
sklearn/tests/test_multioutput.py:168:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
sklearn/utils/metaestimators.py:118: in <lambda>
out = lambda *args, **kwargs: self.fn(obj, *args, **kwargs)
sklearn/multioutput.py:121: in partial_fit
sample_weight, first_time) for i in range(y.shape[1]))
sklearn/externals/joblib/parallel.py:996: in __call__
self.retrieve()
sklearn/externals/joblib/parallel.py:899: in retrieve
self._output.extend(job.get(timeout=self.timeout))
pypy-env/lib_pypy/_functools.py:80: in __call__
return self._func(*(self._args + fargs), **fkeywords)
sklearn/externals/joblib/_parallel_backends.py:517: in wrap_future_result
return future.result(timeout=timeout)
/usr/local/lib-python/3/concurrent/futures/_base.py:405: in result
return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <Future at 0x10052410 state=finished raised BrokenProcessPool>
def __get_result(self):
if self._exception:
> raise self._exception
E sklearn.externals.joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
/usr/local/lib-python/3/concurrent/futures/_base.py:357: BrokenProcessPool
----------------------------- Captured stderr call -----------------------------
/usr/local/lib-python/3/multiprocessing/semaphore_tracker.py:129: UserWarning: semaphore_tracker: There appear to be 1 leaked semaphores to clean up at shutdown
len(cache))
Traceback (most recent call last):
File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/root/project/sklearn/__init__.py", line 64, in <module>
from .base import clone
File "/root/project/sklearn/base.py", line 13, in <module>
from .utils.fixes import signature
File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
from .validation import (as_float_array,
File "/root/project/sklearn/utils/validation.py", line 27, in <module>
from ..utils._joblib import Memory
File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
from ..externals.joblib import __all__ # noqa
File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
from .parallel import Parallel
File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 22, in <module>
from .executor import get_memmapping_executor
File "/root/project/sklearn/externals/joblib/executor.py", line 14, in <module>
from .externals.loky.reusable_executor import get_reusable_executor
File "/root/project/sklearn/externals/joblib/externals/loky/__init__.py", line 11, in <module>
from .backend.context import cpu_count
OSError: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/root/project/sklearn/__init__.py", line 64, in <module>
from .base import clone
File "/root/project/sklearn/base.py", line 13, in <module>
from .utils.fixes import signature
File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
from .validation import (as_float_array,
File "/root/project/sklearn/utils/validation.py", line 27, in <module>
from ..utils._joblib import Memory
File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
from ..externals.joblib import __all__ # noqa
File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
from .parallel import Parallel
File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 20, in <module>
from .pool import MemmappingPool
File "/root/project/sklearn/externals/joblib/pool.py", line 33, in <module>
from ._memmapping_reducer import get_memmapping_reducers
OSError: [Errno 12] Cannot allocate memory
Traceback (most recent call last):
File "/usr/local/lib-python/3/runpy.py", line 183, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/lib-python/3/runpy.py", line 109, in _get_module_details
__import__(pkg_name)
File "/root/project/sklearn/__init__.py", line 64, in <module>
from .base import clone
File "/root/project/sklearn/base.py", line 13, in <module>
from .utils.fixes import signature
File "/root/project/sklearn/utils/__init__.py", line 13, in <module>
from .validation import (as_float_array,
File "/root/project/sklearn/utils/validation.py", line 27, in <module>
from ..utils._joblib import Memory
File "/root/project/sklearn/utils/_joblib.py", line 18, in <module>
from ..externals.joblib import __all__ # noqa
File "/root/project/sklearn/externals/joblib/__init__.py", line 119, in <module>
from .parallel import Parallel
File "/root/project/sklearn/externals/joblib/parallel.py", line 28, in <module>
from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
File "/root/project/sklearn/externals/joblib/_parallel_backends.py", line 22, in <module>
from .executor import get_memmapping_executor
File "/root/project/sklearn/externals/joblib/executor.py", line 14, in <module>
from .externals.loky.reusable_executor import get_reusable_executor
File "/root/project/sklearn/externals/joblib/externals/loky/__init__.py", line 12, in <module>
from .reusable_executor import get_reusable_executor
File "/root/project/sklearn/externals/joblib/externals/loky/reusable_executor.py", line 11, in <module>
from .process_executor import ProcessPoolExecutor, EXTRA_QUEUED_CALLS
File "/root/project/sklearn/externals/joblib/externals/loky/process_executor.py", line 82, in <module>
from .backend.utils import recursive_terminate
OSError: [Errno 12] Cannot allocate memory
will report this upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it's the case on circle but n_jobs=-1
is dangerous on travis with pypy: the robust detection of the current number of cpu based on os.sched_getaffinity
is not possible in pypy (not implemented). Instead joblib / loky fallsback to the basic cpu count detection that can report 32 cores on travis while on 2 are actually available.
Launching 32 pypy worker processes might exhaust the available memory of the worker.
Two solutions:
export LOKY_MAX_CPU_COUNT="2"
in the pypy test setup- change the test itself to
n_jobs=2
I think I prefer the LOKY_MAX_CPU_COUNT
solution.
@@ -24,6 +25,7 @@ def test_get_deps_info(): | |||
assert 'pandas' in deps_info | |||
|
|||
|
|||
@fails_if_pypy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fails in numpy.distutils.system_info.get_info
reported upstream in numpy/numpy#11968
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very strange. I cannot reproduce locally
>>>> import sys, numpy; print(numpy.__version__, sys.version)
1.14.3 3.5.3 (fdd60ed87e94, Apr 24 2018, 06:10:04)
[PyPy 6.0.0 with GCC 6.2.0 20160901]
Thanks for checking @mattip and for suggestions @ogrisel ! Yes, this is very strange. Locally all tests pass for me here, but it appears that in the failing tests in CI, we are somehow escaping from the PyPy virtualenv and end up using the standard library from system CPython. An example is below,
where |
60700ce
to
826513d
Compare
You might need to deactivate the default virtualenv provided by circle ci in # deactivate circleci virtualenv and setup a pypy specific env instead
if [[ `type -t deactivate` ]]; then
deactivate
fi this is taken from the build_doc.sh script in the same folder. It should be inserted before creating the pypy3 venv. |
Yes, but we are using the official PyPy docker image (https://hub.docker.com/_/pypy/) here, not the Circle CI one, and I don't think it includes its own virtualenv (cf Dockerfile). Now I wonder if the issue cannot come from |
I've not been following. There are green ticks. Is this good enough time release 0.20 with? |
Yes, I think it's good enough. Pushed a last commit to clean things up. This is ready to be reviewed, the CI should pass. For 0.20, it might be good to have this in it's current form, while on master add a cron job eventually. Should I make a separate PR for the cron job once this is merged and included in backports? |
def test_docstring_parameters(): | ||
# Test module docstring formatting | ||
|
||
if IS_PYPY: | ||
raise SkipTest('this test segfaults on PyPy') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can't use a mark like skipif?
@@ -65,6 +65,21 @@ jobs: | |||
path: ~/log.txt | |||
destination: log.txt | |||
|
|||
pypy3: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to do it on CRON in master, do that here, and make a separate PR to 0.20.X?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, +1 for merge once @jnothman comments are dealt with.
|
||
pip install -e . | ||
pip install -vv -e . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it still useful to keep the verbose mode here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's still useful, because sometimes the install can take more than 10min, in which case CircleCI will kill the job if there is no output to stdout. It happened once in this PR.
Thanks for the reviews! I addressed all the remaining comments, and updated
Note that it will not run the pypy3 job for PRs that target This can be backported as it is to |
Or maybe I misread the docs, and it will trigger it. I guess we will see.. |
LGTM, let's merge to see if the trigger works as expected on master. @jnothman I let you handle the backport in your PR. |
Thanks! Looks like the pypy cron job did trigger on master as expected https://circleci.com/gh/scikit-learn/scikit-learn/tree/master |
* tag '0.20.0': (77 commits) ColumnTransformer generalization to work on empty lists (scikit-learn#12084) add sparse_threshold to make_columns_transformer (scikit-learn#12152) [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104) Change version to 0.20.0 BUG: check equality instead of identity in check_cv (scikit-learn#12155) [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114) [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089) DOC Removed duplicated doc in tree.rst (scikit-learn#11922) [MRG] DOC covariance doctest examples (scikit-learn#12124) typo and formatting fixes in 0.20 doc (scikit-learn#11963) DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133) [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107) Fix typo (scikit-learn#12126) DOC Typo in OneHotEncoder DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117) DOC Removing quotes from variant names. (scikit-learn#12113) DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108) [MRG] MNT Re-enable PyPy CI (scikit-learn#12039) MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048) TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093) ...
* releases: (77 commits) ColumnTransformer generalization to work on empty lists (scikit-learn#12084) add sparse_threshold to make_columns_transformer (scikit-learn#12152) [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104) Change version to 0.20.0 BUG: check equality instead of identity in check_cv (scikit-learn#12155) [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114) [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089) DOC Removed duplicated doc in tree.rst (scikit-learn#11922) [MRG] DOC covariance doctest examples (scikit-learn#12124) typo and formatting fixes in 0.20 doc (scikit-learn#11963) DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133) [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107) Fix typo (scikit-learn#12126) DOC Typo in OneHotEncoder DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117) DOC Removing quotes from variant names. (scikit-learn#12113) DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108) [MRG] MNT Re-enable PyPy CI (scikit-learn#12039) MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048) TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093) ...
* dfsg: (77 commits) ColumnTransformer generalization to work on empty lists (scikit-learn#12084) add sparse_threshold to make_columns_transformer (scikit-learn#12152) [MRG] Convert ColumnTransformer input list to numpy array (scikit-learn#12104) Change version to 0.20.0 BUG: check equality instead of identity in check_cv (scikit-learn#12155) [MRG] Fix FutureWarnings in logistic regression examples (scikit-learn#12114) [MRG] Update test_metaestimators to pass y parameter when calling score (scikit-learn#12089) DOC Removed duplicated doc in tree.rst (scikit-learn#11922) [MRG] DOC covariance doctest examples (scikit-learn#12124) typo and formatting fixes in 0.20 doc (scikit-learn#11963) DOC Replaced the deprecated early_stopping parameter with n_iter_no_change. (scikit-learn#12133) [MRG +1] ColumnTransformer: store evaluated function column specifier during fit (scikit-learn#12107) Fix typo (scikit-learn#12126) DOC Typo in OneHotEncoder DOC Update fit_transform docstring of OneHotEncoder (scikit-learn#12117) DOC Removing quotes from variant names. (scikit-learn#12113) DOC BaggingRegressor missing default value for oob_score in docstring (scikit-learn#12108) [MRG] MNT Re-enable PyPy CI (scikit-learn#12039) MNT Only checks warnings on latest depedendencies versions in CI (scikit-learn#12048) TST Ignore warnings in common test to avoid collection errors (scikit-learn#12093) ...
Closes #11971 (though it's already closed)
This re-enables PyPy Circle CI build.
TODO