Flaky tests in logistic regression #8879

jl2922 · 2017-05-14T12:31:13Z

On the current master branch (1de66a0), if you run the following command several times:

nosetests sklearn/linear_model/

some tests that compare the results of two methods fail randomly.

Instead of comparing the two methods with each other, maybe we could use the following test case:

x1  x2  y
1   2   0
2   3   0
3   5   1
4   7   0
5   11  1
6   13  1

For this case, I checked with two statistical tools, a professional one (R) and a casual one (Google Docs), and their results agree with each other to the digits shown below:

              Coefficient    Standard Error
Intercept     -2.942920409   3.47036
X Variable 1  -0.9491680059  3.97308
X Variable 2   0.9755542466  1.89720
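The reference coefficients above can be cross-checked without any of the libraries under test. The sketch below fits an unpenalized logistic regression with an intercept using plain Newton iterations and only the standard library. The function names `solve` and `fit_logistic` are illustrative and not part of scikit-learn, and the sketch assumes the reference values come from an unregularized maximum-likelihood fit.

```python
import math

# Data from the proposed test case: (x1, x2) pairs and binary labels y.
X = [(1, 2), (2, 3), (3, 5), (4, 7), (5, 11), (6, 13)]
y = [0, 0, 1, 0, 1, 1]


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_logistic(X, y, max_iter=50, tol=1e-10):
    """Fit unpenalized logistic regression (with intercept) by Newton's method."""
    rows = [(1.0,) + tuple(float(v) for v in xi) for xi in X]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(max_iter):
        # Predicted probabilities under the current coefficients.
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, r))))
             for r in rows]
        # Gradient of the log-likelihood and negative Hessian (X^T W X).
        grad = [sum(r[i] * (yi - pi) for r, yi, pi in zip(rows, y, p))
                for i in range(k)]
        hess = [[sum(r[i] * r[j] * pi * (1.0 - pi) for r, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


beta = fit_logistic(X, y)
print(beta)  # [intercept, coefficient of x1, coefficient of x2]
```

A regularized estimator will not reproduce these numbers unless the penalty is made negligible, so a test built on this data would have to account for that.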


jnothman · 2017-05-14T12:37:47Z

Which tests, precisely?


jl2922 · 2017-05-14T12:42:31Z

In the file test_logistic, the tests that compare the coef_ from two methods, like saga vs liblinear.

glemaitre · 2017-05-14T12:42:56Z

Do you have a hint on how to trigger the failure? I ran the tests 10 times but could not reproduce it.

Also, could you provide your system info:

import platform; print(platform.platform())
import sys; print("Python", sys.version)
import numpy; print("NumPy", numpy.__version__)
import scipy; print("SciPy", scipy.__version__)
import sklearn; print("Scikit-Learn", sklearn.__version__)

jl2922 · 2017-05-14T12:48:15Z

I see. It could be a dependency issue.

Darwin-16.5.0-x86_64-i386-64bit
('Python', '2.7.13 |Anaconda custom (x86_64)| (default, Dec 20 2016, 23:05:08) \n[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]')
('NumPy', '1.9.3')
('SciPy', '0.16.0')
('Scikit-Learn', '0.19.dev0')

jl2922 · 2017-05-14T12:56:51Z

I just updated some packages and still get failures.

Darwin-16.5.0-x86_64-i386-64bit
('Python', '2.7.13 |Anaconda custom (x86_64)| (default, Dec 20 2016, 23:05:08) \n[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]')
('NumPy', '1.12.1')
('SciPy', '0.19.0')
('Scikit-Learn', '0.19.dev0')

yanlin-duan · 2017-05-15T11:08:10Z

@jl2922 Confirmed that the same issue occurred on my side as well. test_saga_vs_liblinear fails ~70% of the time.

Below is my system info:

Darwin-16.5.0-x86_64-i386-64bit
Python 3.5.2 |Anaconda custom (x86_64)| (default, Jul 2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
NumPy 1.11.3
SciPy 0.18.1
Scikit-Learn 0.19.dev0

I have also extracted a failing case (data: failed_X and failed_y; fitted models: failed_saga and failed_liblinear), which may help reproduce the error.

I am still looking into it, but at first glance it seems peculiar for the saga solver to have coefficients as large as -5.623e+177.

test_saga_vs_liblinear_fail_case.zip

glemaitre · 2017-05-15T12:32:51Z

My neighbor reproduced the same error with the following config:

Python 3.5.2 |Anaconda 4.3.1 (x86_64)| (default, Jul 2 2016, 17:52:12)
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)]
NumPy 1.11.3
SciPy 0.18.1
Scikit-Learn 0.19.dev0

@jnothman @ogrisel @lesteve it seems that there is something specific to Darwin and Anaconda.

lesteve · 2017-05-23T13:04:20Z

Maybe some floating point difference? Maybe dependent on the linear algebra library that is being used?
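As a small illustration of the floating-point idea (not a diagnosis of this failure): floating-point addition is not associative, so two libraries that accumulate the same terms in a different order can return different results.

```python
# The same three terms added in two different orders.
left_to_right = (1e16 + 1.0) - 1e16   # the 1.0 is lost when added to 1e16
reordered = (1e16 - 1e16) + 1.0       # the large terms cancel first

print(left_to_right)  # 0.0
print(reordered)      # 1.0
```

Differences of this kind are usually tiny relative to the operands, so on their own they would not be expected to produce coefficients that are off by many orders of magnitude.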

@glemaitre can you kindly ask your neighbor to post the nosetests or pytest command that runs one of the failing tests, plus the stack trace?

glemaitre · 2017-05-23T13:46:21Z

@dengemann (he might be away this week)

Regarding "nosetests or pytest command to run one": I asked for nosetests sklearn/linear_model/tests/test_logistic.py.

dengemann · 2017-05-23T13:52:26Z

I'm back tomorrow. I can share the stack traces with you later.


melgoetz · 2017-07-15T20:39:29Z

I just had this happen to me as well.

Darwin-16.6.0-x86_64-i386-64bit
Python 3.6.1 |Continuum Analytics, Inc.| (default, May 11 2017, 13:04:09)
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.20.dev0

➭ nosetests sklearn/linear_model/tests/test_logistic.py
...................../Users/melaniegoetz/git/scikit-learn/sklearn/linear_model/sag.py:326: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  "the coef_ did not converge", ConvergenceWarning)
......iter  1 act 8.747e+00 pre 7.841e+00 delta 1.521e+00 f 1.386e+01 |g| 1.607e+01 CG   3
iter  2 act 9.618e-01 pre 8.065e-01 delta 1.521e+00 f 5.116e+00 |g| 3.834e+00 CG   2
./Users/melaniegoetz/git/scikit-learn/sklearn/linear_model/sag.py:326: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  "the coef_ did not converge", ConvergenceWarning)
.......F...../Users/melaniegoetz/git/scikit-learn/sklearn/linear_model/sag.py:326: ConvergenceWarning: The max_iter was reached which means the coef_ did not converge
  "the coef_ did not converge", ConvergenceWarning)
F.
======================================================================
FAIL: sklearn.linear_model.tests.test_logistic.test_logreg_l1
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/melaniegoetz/git/scikit-learn/sklearn/linear_model/tests/test_logistic.py", line 930, in test_logreg_l1
    assert_array_almost_equal(lr_saga.coef_, lr_liblinear.coef_)
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/numpy/testing/utils.py", line 962, in assert_array_almost_equal
    precision=decimal)
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/numpy/testing/utils.py", line 778, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 6 decimals

(mismatch 100.0%)
 x: array([[ 16554.14284 ,  12638.782254,  -7736.068316,  15584.072418,
         -2219.832932, -69260.451715, -92176.31246 ,   7751.559091,
         -4640.8965  ,   7621.556785,  10650.712029,   2845.788903,...
 y: array([[ 0.      ,  0.      ,  0.395495,  0.      ,  0.      ,  0.      ,
         3.527385,  0.      ,  0.      ,  0.      , -0.239079,  0.      ,
         0.      , -0.271122,  0.      ,  0.      ,  0.190561,  0.      ,...
>>  raise AssertionError('\nArrays are not almost equal to 6 decimals\n\n(mismatch 100.0%)\n x: array([[ 16554.14284 ,  12638.782254,  -7736.068316,  15584.072418,\n         -2219.832932, -69260.451715, -92176.31246 ,   7751.559091,\n         -4640.8965  ,   7621.556785,  10650.712029,   2845.788903,...\n y: array([[ 0.      ,  0.      ,  0.395495,  0.      ,  0.      ,  0.      ,\n         3.527385,  0.      ,  0.      ,  0.      , -0.239079,  0.      ,\n         0.      , -0.271122,  0.      ,  0.      ,  0.190561,  0.      ,...')


======================================================================
FAIL: sklearn.linear_model.tests.test_logistic.test_saga_vs_liblinear
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/Users/melaniegoetz/git/scikit-learn/sklearn/linear_model/tests/test_logistic.py", line 1144, in test_saga_vs_liblinear
    assert_array_almost_equal(saga.coef_, liblinear.coef_, 3)
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/numpy/testing/utils.py", line 962, in assert_array_almost_equal
    precision=decimal)
  File "/Users/melaniegoetz/miniconda3/envs/sklearndev/lib/python3.6/site-packages/numpy/testing/utils.py", line 778, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Arrays are not almost equal to 3 decimals

(mismatch 100.0%)
 x: array([[-1758.151, -6326.522,  9487.076,  3625.029]])
 y: array([[ 0.   , -1.06 ,  1.222,  0.   ]])
>>  raise AssertionError('\nArrays are not almost equal to 3 decimals\n\n(mismatch 100.0%)\n x: array([[-1758.151, -6326.522,  9487.076,  3625.029]])\n y: array([[ 0.   , -1.06 ,  1.222,  0.   ]])')


----------------------------------------------------------------------
Ran 43 tests in 4.795s

FAILED (failures=2)

TomDLT · 2017-07-17T09:02:27Z

This issue seems to be identical to #9351, solved in #9376.

I guess we can close it.

TomDLT closed this as completed Jul 17, 2017
