Skip to content

MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch #11899

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Aug 25, 2018

Conversation

rth
Copy link
Member

@rth rth commented Aug 23, 2018

Workaround for the fixture reported in MacPython/scikit-learn-wheels#7 (comment) also relevant to #11878

  • Exposes a sklearn.utils.IS_32BIT flag
  • Increases the tolerance for 32 bit tests in the last assertion of test_logistic::test_dtype_match. Also parametrize the test and replace assert_almost_equal with assert_allclose.

I'm not sure the increased tolerance is sufficient yet (need to check all logs from MacPython/scikit-learn-wheels#7)

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right fix? If we're trying to check that float32 and float64 input result in the same model, surely we don't want the processor to matter insofar as possible. Is there a way to identify where in the processing the architecture is affecting?

@rth rth force-pushed the fix-logregr-failure-32bit branch from ab901cd to c6fcb60 Compare August 23, 2018 07:33
@rth
Copy link
Member Author

rth commented Aug 23, 2018

This error occurs only on 32 bit Windows with Python 3.5, 3.6 but not 2.7, 3.4 and 3.7 (cf https://ci.appveyor.com/project/sklearn-wheels/scikit-learn-wheels/build/1.0.66). Linux and Mac OS are fine, and I can confirm that I can't reproduce this on Debian unstable 32 bit. Not sure how one could debug this in this case. Maybe some issues with MKL in those versions? If it was a fundamental issue in our code, I would hope we would have seen it on other platforms / python versions as well.

I think relaxing the accuracy for Windows 32 bit could be reasonable (although not ideal) workaround to make the RC happen. Updated the check to be only taken into account on 32 bit Windows.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But let's create an issue to investigate?

@amueller
Copy link
Member

Do we want this to be public API? maybe keep it private? +1 on trying to fix this way but opening an issue to investigate.

@rth
Copy link
Member Author

rth commented Aug 24, 2018

Thanks for the reviews! Made the flag public private and opened the corresponding issue #11908 . Will merge when CI is green.

@rth
Copy link
Member Author

rth commented Aug 24, 2018

Made the flag public

I mean to say private : sklearn.utils._IS_32BIT

Circle CI fails for python 2 in examples/model_selection/grid_search_text_feature_extraction.py due to

BrokenProcessPool: A process in the executor was terminated abruptly while the future was running or pending.

Reported the issue upstream in https://github.com/tomMoral/loky/issues/151.

This is failure is unrelated to this PR, and the corresponding example is only run because I'm using my own CircleCI setup. I would suggest merging despite it.

Copy link
Member

@qinhanmin2014 qinhanmin2014 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. When we have time, it might be better to investigate all the skipped tests and add a comment to explain why we skip these tests (under certain circumstances)

@qinhanmin2014 qinhanmin2014 merged commit b42a515 into scikit-learn:master Aug 25, 2018
@rth rth deleted the fix-logregr-failure-32bit branch August 25, 2018 10:59
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* tag '0.20rc1': (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* releases: (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
yarikoptic added a commit to yarikoptic/scikit-learn that referenced this pull request Sep 10, 2018
* dfsg: (1109 commits)
  MNT rc version
  DOC Release dates for 0.20 (scikit-learn#11838)
  DOC Fix: require n_splits > 1 in TimeSeriesSplit (scikit-learn#11937)
  FIX xfail for MacOS LogisticRegressionCV stability (scikit-learn#11936)
  MNT: Use GEMV in enet_coordinate_descent (Pt. 1) (scikit-learn#11896)
  [MRG] TST/FIX stop optics reachability failure on 32bit (scikit-learn#11916)
  ENH add multi_class='auto' for LogisticRegression, default from 0.22; default solver will be 'lbfgs' (scikit-learn#11905)
  MAINT Fix test_logistic::test_dtype_match failure on 32 bit arch (scikit-learn#11899)
  DOC Updated link to Laurens van der Maaten's home page (scikit-learn#11907)
  DOC Remove stray backtick in /doc/modules/feature_extraction.rst (scikit-learn#11910)
  Deprecate min_samples_leaf and min_weight_fraction_leaf (scikit-learn#11870)
  MNT modify test_sparse_oneclasssvm to be parametrized (scikit-learn#11894)
  EXA set figure size to avoid overlaps (scikit-learn#11889)
  MRG/REL fixes /skips for 32bit tests (scikit-learn#11879)
  add durations=20 to makefile to show test runtimes locally (scikit-learn#11147)
  DOC loss='l2' is no longer accpeted in l1_min_c
  DOC add note about brute force nearest neighbors for string data (scikit-learn#11884)
  DOC Change sign of energy in RBM (scikit-learn#11156)
  RFC try to warn on iid less often (scikit-learn#11613)
  DOC reduce plot_gpr_prior_posterior.py warnings(scikit-learn#11664)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants