[WIP] Common test for equivalence between sparse and dense matrices. #7590

maniteja123 · 2016-10-06T05:47:03Z

What does this implement/fix? Explain your changes.

The common test ensures that results on sparse and dense matrices are identical.

Any other comments?

Some tests fail for estimators in linear_model, all of which have separate code paths for sparse and dense matrices. I am not aware of any reason why they should behave differently on sparse and dense matrices. Please let me know if there is one. I will also look into the code paths. Thanks.

Notes

In the neighbors module, the algorithm is set to brute for sparse matrices, but it is auto by default. So I have set the algorithm parameter to brute so that the tests pass.
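For readers following along, the kind of check described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper name `predictions_match`, the toy data, and the choice of `KNeighborsClassifier` are assumptions made for the example. Only the call `assert_array_almost_equal(pred, pred_sp, 2)` mirrors what appears in the tracebacks later in the thread.

```python
import numpy as np
from scipy import sparse
from sklearn.base import clone
from sklearn.neighbors import KNeighborsClassifier


def predictions_match(estimator, X_dense, y, decimal=2):
    """Hypothetical helper: fit one clone on dense X and one on CSR X,
    then compare the two sets of predictions."""
    X_sparse = sparse.csr_matrix(X_dense)
    pred = clone(estimator).fit(X_dense, y).predict(X_dense)
    pred_sp = clone(estimator).fit(X_sparse, y).predict(X_sparse)
    # Raises AssertionError if the predictions disagree.
    np.testing.assert_array_almost_equal(pred, pred_sp, decimal)
    return True


# Toy data (an assumption for this sketch): 40 samples, 5 features,
# with two entries per row zeroed so the CSR matrix is actually sparse.
rng = np.random.RandomState(0)
X = rng.uniform(size=(40, 5))
rows = np.arange(40)
X[rows, rows % 5] = 0.0
X[rows, (rows + 2) % 5] = 0.0
y = rng.randint(0, 3, size=40)

# algorithm="brute" is set explicitly, as described in the note above.
ok = predictions_match(KNeighborsClassifier(algorithm="brute"), X, y)
```

Setting the algorithm explicitly means both fits go through the same neighbor search, so any remaining difference would come from the sparse or dense handling itself.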

amueller · 2016-10-07T17:07:37Z

In the neighbors module, the tests should also pass with auto.

amueller · 2016-10-07T17:09:33Z

That the linear model classification outputs are different is a bit concerning to me... For SGD, increasing the number of iterations might help. I'm confused by Perceptron and PassiveAggressive having different results... hum

jnothman · 2016-10-08T10:13:04Z

Might not get neighbors passing with auto without merging #5596


jnothman · 2016-10-10T23:07:27Z

@maniteja123 please change to [MRG] when you want reviews

maniteja123 · 2016-10-14T15:11:26Z

Hi, I tried to debug the linear_model failures but could not get any insight. I would appreciate any leads on debugging this. Thanks.

jnothman · 2016-10-16T07:08:00Z

you can sometimes get faster feedback by copying the error message in here

maniteja123 · 2016-10-16T07:21:03Z

FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(LinearRegression)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 85.0%)
 x: array([ 0.96674273,  1.22343467,  0.44977076,  1.55090581,  1.39961997,
        1.00263803,  1.91457316,  0.68037063,  1.81105886,  1.53250171,
        1.80794754,  1.52524137,  1.01560744,  1.39961997,  0.54757601,...
 y: array([ 0.96795674,  1.25113222,  0.39973542,  1.49745071,  1.225     ,
        0.92083229,  1.87924544,  0.66670258,  1.70889972,  1.46430677,
        1.79683369,  1.55457563,  0.93076946,  1.225     ,  0.67464672,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 85.0%)\n x: array([ 0.96674273,  1.22343467,  0.44977076,  1.55090581,  1.39961997,\n        1.00263803,  1.91457316,  0.68037063,  1.81105886,  1.53250171,\n        1.80794754,  1.52524137,  1.01560744,  1.39961997,  0.54757601,...\n y: array([ 0.96795674,  1.25113222,  0.39973542,  1.49745071,  1.225     ,\n        0.92083229,  1.87924544,  0.66670258,  1.70889972,  1.46430677,\n        1.79683369,  1.55457563,  0.93076946,  1.225     ,  0.67464672,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(PassiveAggressiveClassifier)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 20.0%)
 x: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 2, 2, 1, 2, 0, 1, 0, 1, 0, 2, 2, 0, 2,
       2, 0, 2, 2, 0, 2, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 2])
 y: array([1, 3, 0, 2, 2, 1, 3, 0, 2, 2, 3, 0, 1, 2, 0, 1, 0, 1, 0, 3, 2, 0, 2,
       2, 0, 2, 3, 0, 3, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 20.0%)\n x: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 2, 2, 1, 2, 0, 1, 0, 1, 0, 2, 2, 0, 2,\n       2, 0, 2, 2, 0, 2, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 2])\n y: array([1, 3, 0, 2, 2, 1, 3, 0, 2, 2, 3, 0, 1, 2, 0, 1, 0, 1, 0, 3, 2, 0, 2,\n       2, 0, 2, 3, 0, 3, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(PassiveAggressiveRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([ 0.68679362,  1.16641719,  0.01746298,  1.46664109,  0.96385265,
        1.19200123,  1.11207778,  1.08051501,  0.57213576,  1.40547629,
        1.92124901, -0.53745013,  1.18454761,  0.96385265,  0.32975875,...
 y: array([ 1.6564933 ,  1.64894692, -1.20160109,  2.03978364,  0.3009845 ,
        0.84102172,  1.95448014, -0.22068228,  1.34525846,  1.82825671,
        2.16418556,  1.39318311,  0.82337868,  0.3009845 ,  0.09264704,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([ 0.68679362,  1.16641719,  0.01746298,  1.46664109,  0.96385265,\n        1.19200123,  1.11207778,  1.08051501,  0.57213576,  1.40547629,\n        1.92124901, -0.53745013,  1.18454761,  0.96385265,  0.32975875,...\n y: array([ 1.6564933 ,  1.64894692, -1.20160109,  2.03978364,  0.3009845 ,\n        0.84102172,  1.95448014, -0.22068228,  1.34525846,  1.82825671,\n        2.16418556,  1.39318311,  0.82337868,  0.3009845 ,  0.09264704,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(Perceptron)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 42.5%)
 x: array([1, 0, 0, 2, 0, 0, 2, 0, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 0, 1,
       0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2])
 y: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 0, 2, 1, 2, 0, 0, 0, 1, 0, 0, 2, 0, 2,
       2, 0, 2, 0, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 42.5%)\n x: array([1, 0, 0, 2, 0, 0, 2, 0, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 0, 1,\n       0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2])\n y: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 0, 2, 1, 2, 0, 0, 0, 1, 0, 0, 2, 0, 2,\n       2, 0, 2, 0, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RANSACRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([-1.56293319,  2.8987167 ,  0.1957102 , -0.13742639,  0.27718436,
        0.7128559 ,  1.29639132,  2.12392524,  2.25118566, -0.08698851,
        4.48011825,  0.92483196,  0.6986225 ,  0.27718436,  2.68786926,...
 y: array([ 0.99502537,  1.79343459, -0.23309143,  1.33439117,  1.        ,
        0.57204012,  0.94177303, -0.28714216,  1.50324173,  1.2937121 ,
        2.18721045,  1.59536532,  0.58602159,  1.        , -0.17576423,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([-1.56293319,  2.8987167 ,  0.1957102 , -0.13742639,  0.27718436,\n        0.7128559 ,  1.29639132,  2.12392524,  2.25118566, -0.08698851,\n        4.48011825,  0.92483196,  0.6986225 ,  0.27718436,  2.68786926,...\n y: array([ 0.99502537,  1.79343459, -0.23309143,  1.33439117,  1.        ,\n        0.57204012,  0.94177303, -0.28714216,  1.50324173,  1.2937121 ,\n        2.18721045,  1.59536532,  0.58602159,  1.        , -0.17576423,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(Ridge)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 65.0%)
 x: array([ 1.02180472,  1.2303366 ,  0.58611176,  1.48164789,  1.3768671 ,
        1.05220481,  1.71579798,  0.75538989,  1.72506866,  1.46890119,
        1.69111745,  1.42737327,  1.06281154,  1.3768671 ,  0.64343746,...
 y: array([ 1.01949709,  1.23436092,  0.57412104,  1.46699924,  1.33766893,
        1.03115678,  1.70541306,  0.74877941,  1.70222799,  1.45126606,
        1.68622629,  1.43243801,  1.04117054,  1.33766893,  0.66990505,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 65.0%)\n x: array([ 1.02180472,  1.2303366 ,  0.58611176,  1.48164789,  1.3768671 ,\n        1.05220481,  1.71579798,  0.75538989,  1.72506866,  1.46890119,\n        1.69111745,  1.42737327,  1.06281154,  1.3768671 ,  0.64343746,...\n y: array([ 1.01949709,  1.23436092,  0.57412104,  1.46699924,  1.33766893,\n        1.03115678,  1.70541306,  0.74877941,  1.70222799,  1.45126606,\n        1.68622629,  1.43243801,  1.04117054,  1.33766893,  0.66990505,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RidgeCV)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 90.0%)
 x: array([ 1.16997956,  1.23686115,  0.96742657,  1.33232525,  1.29261239,
        1.16198637,  1.38337165,  1.01153436,  1.43163852,  1.32749414,
        1.40027198,  1.27939317,  1.16625393,  1.29261239,  0.97474283,...
 y: array([ 1.14077769,  1.21801566,  0.93213681,  1.28826936,  1.225     ,
        1.1087318 ,  1.34574413,  0.97518746,  1.38431725,  1.28057257,
        1.37047542,  1.26482089,  1.11253029,  1.225     ,  0.98562192,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 90.0%)\n x: array([ 1.16997956,  1.23686115,  0.96742657,  1.33232525,  1.29261239,\n        1.16198637,  1.38337165,  1.01153436,  1.43163852,  1.32749414,\n        1.40027198,  1.27939317,  1.16625393,  1.29261239,  0.97474283,...\n y: array([ 1.14077769,  1.21801566,  0.93213681,  1.28826936,  1.225     ,\n        1.1087318 ,  1.34574413,  0.97518746,  1.38431725,  1.28057257,\n        1.37047542,  1.26482089,  1.11253029,  1.225     ,  0.98562192,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RidgeClassifierCV)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 5.0%)
 x: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
 y: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 5.0%)\n x: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,\n       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0])\n y: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1,\n       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(SGDClassifier)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 20.0%)
 x: array([1, 1, 0, 2, 2, 1, 2, 2, 2, 2, 3, 2, 2, 2, 0, 1, 0, 1, 0, 3, 2, 2, 1,
       2, 0, 0, 3, 0, 2, 1, 0, 2, 2, 2, 2, 2, 0, 1, 0, 0])
 y: array([1, 3, 0, 2, 2, 1, 2, 0, 2, 2, 3, 2, 1, 2, 0, 0, 0, 1, 0, 3, 2, 0, 1,
       2, 0, 2, 3, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 20.0%)\n x: array([1, 1, 0, 2, 2, 1, 2, 2, 2, 2, 3, 2, 2, 2, 0, 1, 0, 1, 0, 3, 2, 2, 1,\n       2, 0, 0, 3, 0, 2, 1, 0, 2, 2, 2, 2, 2, 0, 1, 0, 0])\n y: array([1, 3, 0, 2, 2, 1, 2, 0, 2, 2, 3, 2, 1, 2, 0, 0, 0, 1, 0, 3, 2, 0, 1,\n       2, 0, 2, 3, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(SGDRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([ 0.74089592,  0.82438143,  0.66343549,  0.69959472,  0.57126915,
        0.602163  ,  0.75274869,  0.65297073,  0.71555488,  0.68398377,
        0.80443575,  0.86861232,  0.6011537 ,  0.57126915,  0.9226158 ,...
 y: array([ 0.24967165,  0.3610464 ,  0.16688444,  0.17912751,  0.00778348,
        0.06265229,  0.24860046,  0.15001561,  0.19515826,  0.15828331,
        0.31979999,  0.41888435,  0.06085973,  0.00778348,  0.5256354 ,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([ 0.74089592,  0.82438143,  0.66343549,  0.69959472,  0.57126915,\n        0.602163  ,  0.75274869,  0.65297073,  0.71555488,  0.68398377,\n        0.80443575,  0.86861232,  0.6011537 ,  0.57126915,  0.9226158 ,...\n y: array([ 0.24967165,  0.3610464 ,  0.16688444,  0.17912751,  0.00778348,\n        0.06265229,  0.24860046,  0.15001561,  0.19515826,  0.15828331,\n        0.31979999,  0.41888435,  0.06085973,  0.00778348,  0.5256354 ,...')

maniteja123 · 2016-10-16T07:23:07Z

Hi Joel, sorry, I didn't paste it because it would send lengthy notifications to everyone. I will remember that next time. As for the errors, the mismatch percentages are mostly non-deterministic (I suppose due to the random seed), but these 10 estimators always fail.

jnothman · 2016-10-18T09:40:19Z

Well, it isn't necessary to post the entirety of them, but giving examples
and your suspicions can help. Sorry I don't have time right now to
investigate.


jnothman · 2016-10-18T09:43:47Z

Can you try setting fit_intercept=False by default? This may have something to do with _preprocess_data

maniteja123 · 2016-10-19T06:48:15Z

Thanks @jnothman for the suggestion. I did try to set fit_intercept = False but the tests are still failing.

jnothman · 2016-10-19T13:15:13Z

sklearn/utils/estimator_checks.py

+                      % name)
+                raise
+        except Exception:
+            print("Estimator %s doesn't seem to fail gracefully on "


(This is being output when there's an AssertionError, which is a bit yuck)

jnothman · 2016-10-19T23:03:02Z

A lot of the linear models have different paths for sparse and dense data. For example, when Ridge has solver='auto', it chooses between 'cholesky' and 'sparse_cg' depending on the presence of sparse data or sample weight.

We might not be able to solve this problem. Ideally, we might want to show that sparse and dense data produce the same results under some parameter configuration, rather than the default. This, however, does not fit neatly in our common tests framework.
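As a rough illustration of "the same results under some parameter configuration", the sketch below pins Ridge to a single solver so that sparse and dense inputs follow the same path. The toy data and the variable names are assumptions made for this example. `fit_intercept=False` is also an assumption, chosen here only to keep intercept handling out of the comparison.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

# Toy data (an assumption for this sketch), with small entries zeroed
# so that the CSR representation is genuinely sparse.
rng = np.random.RandomState(0)
X = rng.normal(size=(40, 5))
X[np.abs(X) < 0.5] = 0.0
y = rng.normal(size=40)

# Pin the solver instead of relying on solver='auto'.
params = dict(alpha=1.0, solver="cholesky", fit_intercept=False)

coef_dense = Ridge(**params).fit(X, y).coef_
coef_sparse = Ridge(**params).fit(sparse.csr_matrix(X), y).coef_

# Largest absolute difference between the two coefficient vectors.
max_abs_diff = float(np.max(np.abs(coef_dense - coef_sparse)))
```

With the solver fixed, the two fits should agree up to floating-point error, which is the property a common test could assert for a chosen configuration.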

lesteve · 2016-11-15T13:27:12Z

> We might not be able to solve this problem. Ideally, we might want to show that sparse and dense data produce the same results under some parameter configuration, rather than the default. This, however, does not fit neatly in our common tests framework.

Could you have a dict of per-class parameters for the cases where the default parameters are not appropriate for comparing outputs on dense and sparse input matrices, i.e. something like:

dense_vs_sparse_additional_params = {
    'Ridge': {'solver': 'cholesky'},
    # ... entries for other estimators
}

Then you would use dense_vs_sparse_additional_params to build the estimators. If the class does not appear in dense_vs_sparse_additional_params then you just use the default parameters.
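A minimal sketch of how that lookup might work is given below. The helper name `build_estimator` is hypothetical, and only the Ridge entry comes from the suggestion above; everything else is an assumption for illustration.

```python
from sklearn.linear_model import LinearRegression, Ridge

# Per-class overrides; only the Ridge entry is taken from the suggestion.
dense_vs_sparse_additional_params = {
    "Ridge": {"solver": "cholesky"},
}


def build_estimator(Estimator):
    """Hypothetical helper: instantiate Estimator with its overrides,
    or with default parameters if the class has no entry."""
    extra = dense_vs_sparse_additional_params.get(Estimator.__name__, {})
    return Estimator(**extra)


ridge = build_estimator(Ridge)              # built with solver='cholesky'
linreg = build_estimator(LinearRegression)  # no entry, so defaults apply
```

Keying the dict by class name keeps the overrides separate from the estimators themselves, at the cost of having to maintain the list by hand.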

jnothman · 2016-11-16T02:13:09Z

That solution is okay, @lesteve, while we still don't really have solutions for extensible common testing.

wdevazelhes · 2019-02-25T11:01:38Z

I'd like to take this issue for the sprint if no one is working on it

jnothman · 2019-04-16T02:57:32Z

Closing in preference for #13246. Also untagging with milestone, in preference for tagging the issue (#1572)

maniteja123 added 2 commits October 4, 2016 23:24

Add test for sparse and dense equivalence

a22ee34

Set random state

17b4da6

RPGOne approved these changes Oct 10, 2016


maniteja123 mentioned this pull request Oct 11, 2016

[MRG] Fix NearestNeighbors algorithm='auto' to work with all supported metrics by default #5596

Closed

amueller added this to the 0.19 milestone Oct 17, 2016

jnothman reviewed Oct 19, 2016


lesteve mentioned this pull request Nov 15, 2016

[MRG + 1] Fix for OvR partial_fit bug #7786

Merged

jnothman modified the milestones: 0.20, 0.19 Jun 13, 2017

glemaitre modified the milestones: 0.20, 0.21 Jun 13, 2018

wdevazelhes mentioned this pull request Feb 25, 2019

[WIP] Common test for equivalence between sparse and dense matrices. #13246

Closed

7 tasks

jeromedockes mentioned this pull request Mar 7, 2019

[MRG] handle sparse x and intercept in _RidgeGCV #13350

Merged

jnothman modified the milestones: 0.21, 0.22 Apr 16, 2019

jnothman removed this from the 0.22 milestone Apr 16, 2019

jnothman closed this Apr 16, 2019
