[WIP] Common test for equivalence between sparse and dense matrices. #7590

Closed

maniteja123
Contributor

@maniteja123 maniteja123 commented Oct 6, 2016

Solves #1572

What does this implement/fix? Explain your changes.

The common test ensures that results on sparse and dense matrices are identical.

Any other comments?

Some tests fail for estimators in linear_model, all of which have separate code paths for sparse and dense matrices. I am not aware of any difference in how they work on sparse and dense matrices. Please let me know if there is a reason for this. I will also look into the code paths. Thanks.

Notes

In the neighbors module, the algorithm is set to brute for sparse matrices, whereas the default is auto. I have therefore set the algorithm parameter to brute so that the tests pass.
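A minimal sketch of the kind of comparison described in the note above, assuming scikit-learn, NumPy and SciPy are installed. The toy data, the choice of KNeighborsClassifier and n_neighbors=3 are illustrative assumptions, not the code from this PR. Both fits use algorithm='brute', so the dense and sparse inputs go through the same neighbor search.

```python
import numpy as np
from scipy import sparse
from sklearn.neighbors import KNeighborsClassifier

# Two well-separated clusters (illustrative data): three points on the
# z-axis labelled 0 and three points on the x-axis labelled 1.
X = np.array([
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.1],
    [0.0, 0.0, 0.9],
    [2.0, 0.0, 0.0],
    [2.1, 0.0, 0.0],
    [1.9, 0.0, 0.0],
])
y = np.array([0, 0, 0, 1, 1, 1])
X_sp = sparse.csr_matrix(X)  # same data in CSR format

# Force the same neighbor-search algorithm for both input types.
params = {"algorithm": "brute", "n_neighbors": 3}
pred = KNeighborsClassifier(**params).fit(X, y).predict(X)
pred_sp = KNeighborsClassifier(**params).fit(X_sp, y).predict(X_sp)

# The predictions on dense and sparse input should agree.
np.testing.assert_array_equal(pred, pred_sp)
```

Whether the same comparison holds with algorithm='auto' is exactly what the following comments discuss, so the sketch does not assume it.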

@amueller
Member

amueller commented Oct 7, 2016

In the neighbors module, the tests should also pass with auto.

@amueller
Member

amueller commented Oct 7, 2016

That the linear model classification outputs are different is a bit concerning to me... For SGD, increasing the number of iterations might help. I'm confused by Perceptron and PassiveAggressive having different results... hmm

@jnothman
Member

jnothman commented Oct 8, 2016

We might not get neighbors passing with auto without merging #5596.

@jnothman
Member

@maniteja123 please change to [MRG] when you want reviews

@maniteja123
Contributor Author

maniteja123 commented Oct 14, 2016

Hi, I have tried to debug the linear_model failures but was unable to get any insight. I would appreciate any leads for debugging this. Thanks.

@jnothman
Member

You can sometimes get faster feedback by copying the error message in here.

@maniteja123
Contributor Author

FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(LinearRegression)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 85.0%)
 x: array([ 0.96674273,  1.22343467,  0.44977076,  1.55090581,  1.39961997,
        1.00263803,  1.91457316,  0.68037063,  1.81105886,  1.53250171,
        1.80794754,  1.52524137,  1.01560744,  1.39961997,  0.54757601,...
 y: array([ 0.96795674,  1.25113222,  0.39973542,  1.49745071,  1.225     ,
        0.92083229,  1.87924544,  0.66670258,  1.70889972,  1.46430677,
        1.79683369,  1.55457563,  0.93076946,  1.225     ,  0.67464672,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 85.0%)\n x: array([ 0.96674273,  1.22343467,  0.44977076,  1.55090581,  1.39961997,\n        1.00263803,  1.91457316,  0.68037063,  1.81105886,  1.53250171,\n        1.80794754,  1.52524137,  1.01560744,  1.39961997,  0.54757601,...\n y: array([ 0.96795674,  1.25113222,  0.39973542,  1.49745071,  1.225     ,\n        0.92083229,  1.87924544,  0.66670258,  1.70889972,  1.46430677,\n        1.79683369,  1.55457563,  0.93076946,  1.225     ,  0.67464672,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(PassiveAggressiveClassifier)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 20.0%)
 x: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 2, 2, 1, 2, 0, 1, 0, 1, 0, 2, 2, 0, 2,
       2, 0, 2, 2, 0, 2, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 2])
 y: array([1, 3, 0, 2, 2, 1, 3, 0, 2, 2, 3, 0, 1, 2, 0, 1, 0, 1, 0, 3, 2, 0, 2,
       2, 0, 2, 3, 0, 3, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 20.0%)\n x: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 2, 2, 1, 2, 0, 1, 0, 1, 0, 2, 2, 0, 2,\n       2, 0, 2, 2, 0, 2, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 2])\n y: array([1, 3, 0, 2, 2, 1, 3, 0, 2, 2, 3, 0, 1, 2, 0, 1, 0, 1, 0, 3, 2, 0, 2,\n       2, 0, 2, 3, 0, 3, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(PassiveAggressiveRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([ 0.68679362,  1.16641719,  0.01746298,  1.46664109,  0.96385265,
        1.19200123,  1.11207778,  1.08051501,  0.57213576,  1.40547629,
        1.92124901, -0.53745013,  1.18454761,  0.96385265,  0.32975875,...
 y: array([ 1.6564933 ,  1.64894692, -1.20160109,  2.03978364,  0.3009845 ,
        0.84102172,  1.95448014, -0.22068228,  1.34525846,  1.82825671,
        2.16418556,  1.39318311,  0.82337868,  0.3009845 ,  0.09264704,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([ 0.68679362,  1.16641719,  0.01746298,  1.46664109,  0.96385265,\n        1.19200123,  1.11207778,  1.08051501,  0.57213576,  1.40547629,\n        1.92124901, -0.53745013,  1.18454761,  0.96385265,  0.32975875,...\n y: array([ 1.6564933 ,  1.64894692, -1.20160109,  2.03978364,  0.3009845 ,\n        0.84102172,  1.95448014, -0.22068228,  1.34525846,  1.82825671,\n        2.16418556,  1.39318311,  0.82337868,  0.3009845 ,  0.09264704,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(Perceptron)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 42.5%)
 x: array([1, 0, 0, 2, 0, 0, 2, 0, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 0, 1,
       0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2])
 y: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 0, 2, 1, 2, 0, 0, 0, 1, 0, 0, 2, 0, 2,
       2, 0, 2, 0, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 42.5%)\n x: array([1, 0, 0, 2, 0, 0, 2, 0, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 3, 2, 0, 1,\n       0, 0, 0, 3, 0, 0, 1, 0, 2, 0, 0, 2, 2, 0, 0, 0, 2])\n y: array([1, 0, 0, 2, 2, 1, 2, 0, 2, 2, 0, 2, 1, 2, 0, 0, 0, 1, 0, 0, 2, 0, 2,\n       2, 0, 2, 0, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RANSACRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([-1.56293319,  2.8987167 ,  0.1957102 , -0.13742639,  0.27718436,
        0.7128559 ,  1.29639132,  2.12392524,  2.25118566, -0.08698851,
        4.48011825,  0.92483196,  0.6986225 ,  0.27718436,  2.68786926,...
 y: array([ 0.99502537,  1.79343459, -0.23309143,  1.33439117,  1.        ,
        0.57204012,  0.94177303, -0.28714216,  1.50324173,  1.2937121 ,
        2.18721045,  1.59536532,  0.58602159,  1.        , -0.17576423,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([-1.56293319,  2.8987167 ,  0.1957102 , -0.13742639,  0.27718436,\n        0.7128559 ,  1.29639132,  2.12392524,  2.25118566, -0.08698851,\n        4.48011825,  0.92483196,  0.6986225 ,  0.27718436,  2.68786926,...\n y: array([ 0.99502537,  1.79343459, -0.23309143,  1.33439117,  1.        ,\n        0.57204012,  0.94177303, -0.28714216,  1.50324173,  1.2937121 ,\n        2.18721045,  1.59536532,  0.58602159,  1.        , -0.17576423,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(Ridge)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 65.0%)
 x: array([ 1.02180472,  1.2303366 ,  0.58611176,  1.48164789,  1.3768671 ,
        1.05220481,  1.71579798,  0.75538989,  1.72506866,  1.46890119,
        1.69111745,  1.42737327,  1.06281154,  1.3768671 ,  0.64343746,...
 y: array([ 1.01949709,  1.23436092,  0.57412104,  1.46699924,  1.33766893,
        1.03115678,  1.70541306,  0.74877941,  1.70222799,  1.45126606,
        1.68622629,  1.43243801,  1.04117054,  1.33766893,  0.66990505,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 65.0%)\n x: array([ 1.02180472,  1.2303366 ,  0.58611176,  1.48164789,  1.3768671 ,\n        1.05220481,  1.71579798,  0.75538989,  1.72506866,  1.46890119,\n        1.69111745,  1.42737327,  1.06281154,  1.3768671 ,  0.64343746,...\n y: array([ 1.01949709,  1.23436092,  0.57412104,  1.46699924,  1.33766893,\n        1.03115678,  1.70541306,  0.74877941,  1.70222799,  1.45126606,\n        1.68622629,  1.43243801,  1.04117054,  1.33766893,  0.66990505,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RidgeCV)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 90.0%)
 x: array([ 1.16997956,  1.23686115,  0.96742657,  1.33232525,  1.29261239,
        1.16198637,  1.38337165,  1.01153436,  1.43163852,  1.32749414,
        1.40027198,  1.27939317,  1.16625393,  1.29261239,  0.97474283,...
 y: array([ 1.14077769,  1.21801566,  0.93213681,  1.28826936,  1.225     ,
        1.1087318 ,  1.34574413,  0.97518746,  1.38431725,  1.28057257,
        1.37047542,  1.26482089,  1.11253029,  1.225     ,  0.98562192,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 90.0%)\n x: array([ 1.16997956,  1.23686115,  0.96742657,  1.33232525,  1.29261239,\n        1.16198637,  1.38337165,  1.01153436,  1.43163852,  1.32749414,\n        1.40027198,  1.27939317,  1.16625393,  1.29261239,  0.97474283,...\n y: array([ 1.14077769,  1.21801566,  0.93213681,  1.28826936,  1.225     ,\n        1.1087318 ,  1.34574413,  0.97518746,  1.38431725,  1.28057257,\n        1.37047542,  1.26482089,  1.11253029,  1.225     ,  0.98562192,...')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(RidgeClassifierCV)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 5.0%)
 x: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0])
 y: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1,
       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 5.0%)\n x: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1,\n       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0])\n y: array([1, 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1,\n       0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(SGDClassifier)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 20.0%)
 x: array([1, 1, 0, 2, 2, 1, 2, 2, 2, 2, 3, 2, 2, 2, 0, 1, 0, 1, 0, 3, 2, 2, 1,
       2, 0, 0, 3, 0, 2, 1, 0, 2, 2, 2, 2, 2, 0, 1, 0, 0])
 y: array([1, 3, 0, 2, 2, 1, 2, 0, 2, 2, 3, 2, 1, 2, 0, 0, 0, 1, 0, 3, 2, 0, 1,
       2, 0, 2, 3, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 20.0%)\n x: array([1, 1, 0, 2, 2, 1, 2, 2, 2, 2, 3, 2, 2, 2, 0, 1, 0, 1, 0, 3, 2, 2, 1,\n       2, 0, 0, 3, 0, 2, 1, 0, 2, 2, 2, 2, 2, 0, 1, 0, 0])\n y: array([1, 3, 0, 2, 2, 1, 2, 0, 2, 2, 3, 2, 1, 2, 0, 0, 0, 1, 0, 3, 2, 0, 1,\n       2, 0, 2, 3, 0, 0, 1, 0, 2, 2, 2, 2, 2, 2, 1, 0, 0])')

======================================================================
FAIL: /home/travis/sklearn_build_oldest/scikit-learn/sklearn/tests/test_common.py.test_non_meta_estimators:check_estimator_sparse_dense(SGDRegressor)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/testing.py", line 826, in __call__
    return self.check(*args, **kwargs)
  File "/home/travis/sklearn_build_oldest/scikit-learn/sklearn/utils/estimator_checks.py", line 1594, in check_estimator_sparse_dense
    assert_array_almost_equal(pred, pred_sp, 2)
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 800, in assert_array_almost_equal
    header=('Arrays are not almost equal to %d decimals' % decimal))
  File "/home/travis/miniconda/envs/testenv/lib/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not almost equal to 2 decimals
(mismatch 100.0%)
 x: array([ 0.74089592,  0.82438143,  0.66343549,  0.69959472,  0.57126915,
        0.602163  ,  0.75274869,  0.65297073,  0.71555488,  0.68398377,
        0.80443575,  0.86861232,  0.6011537 ,  0.57126915,  0.9226158 ,...
 y: array([ 0.24967165,  0.3610464 ,  0.16688444,  0.17912751,  0.00778348,
        0.06265229,  0.24860046,  0.15001561,  0.19515826,  0.15828331,
        0.31979999,  0.41888435,  0.06085973,  0.00778348,  0.5256354 ,...
>>  raise AssertionError('\nArrays are not almost equal to 2 decimals\n\n(mismatch 100.0%)\n x: array([ 0.74089592,  0.82438143,  0.66343549,  0.69959472,  0.57126915,\n        0.602163  ,  0.75274869,  0.65297073,  0.71555488,  0.68398377,\n        0.80443575,  0.86861232,  0.6011537 ,  0.57126915,  0.9226158 ,...\n y: array([ 0.24967165,  0.3610464 ,  0.16688444,  0.17912751,  0.00778348,\n        0.06265229,  0.24860046,  0.15001561,  0.19515826,  0.15828331,\n        0.31979999,  0.41888435,  0.06085973,  0.00778348,  0.5256354 ,...')
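For readers interpreting the mismatch percentages in the log above: assert_array_almost_equal(pred, pred_sp, 2) checks each element against a tolerance derived from the decimal argument. The sketch below assumes the criterion given in current NumPy documentation, abs(desired - actual) < 1.5 * 10**(-decimal); the NumPy version used in the Travis build may have applied a slightly different rule. It applies that rule in plain Python to the first three LinearRegression values from the log. The variable names are illustrative.

```python
# First three LinearRegression predictions copied from the log above
# (x: dense input, y: sparse input).
dense_pred = [0.96674273, 1.22343467, 0.44977076]
sparse_pred = [0.96795674, 1.25113222, 0.39973542]

decimal = 2
# Assumed criterion (current NumPy docs): |desired - actual| < 1.5 * 10**(-decimal)
tolerance = 1.5 * 10 ** (-decimal)

# True where the pair would count as "almost equal" under this rule.
within = [abs(a - b) < tolerance for a, b in zip(dense_pred, sparse_pred)]

# Share of mismatching elements, analogous to the "(mismatch ...%)" line.
mismatch_pct = 100.0 * within.count(False) / len(within)

print(within)        # element-wise result
print(mismatch_pct)  # percentage of mismatches among these three values
```

Under this rule only the first pair is within tolerance, so the differences in the log are well above rounding noise rather than borderline failures.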

@maniteja123
Contributor Author

Hi Joel, sorry, I didn't paste it earlier because it would send lengthy notifications to everyone. I will remember that next time. As for the errors, the mismatch percentages are mostly non-deterministic (I suppose this is due to the random seed), but these 10 estimators always fail.

@amueller amueller added this to the 0.19 milestone Oct 17, 2016
@jnothman
Member

Well, it isn't necessary to post the entirety of them, but giving examples and your suspicions can help. Sorry I don't have time right now to investigate.

@jnothman
Member

Can you try setting fit_intercept=False by default? This may have something to do with _preprocess_data.

@maniteja123
Contributor Author

Thanks @jnothman for the suggestion. I tried setting fit_intercept=False, but the tests are still failing.

% name)
raise
except Exception:
print("Estimator %s doesn't seem to fail gracefully on "
Member

(This is being output when there's an AssertionError, which is a bit yuck)

@jnothman
Member

A lot of the linear models have different paths for sparse and dense data. For example, when Ridge's solver='auto', it selects between 'cholesky' and 'sparse_cg' based on the presence of sparse data or sample weights.

We might not be able to solve this problem. Ideally, we might want to show that sparse and dense data produce the same results under some parameter configuration, rather than the default. This, however, does not fit neatly in our common tests framework.
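As a rough illustration of comparing sparse and dense input under a fixed, non-default configuration, the sketch below pins Ridge to one solver so that both input types take the same numerical route. It assumes scikit-learn, NumPy and SciPy are installed. The random data, the tolerance, and the combination solver='cholesky' with fit_intercept=False are assumptions chosen for the example, not settings confirmed by this thread.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

# Illustrative data: mostly-zero feature matrix and a random target.
rng = np.random.RandomState(0)
X = rng.uniform(size=(40, 5))
X[X < 0.8] = 0
y = rng.uniform(size=40)
X_sp = sparse.csr_matrix(X)  # same values in CSR format

# Pin the solver instead of relying on solver='auto', and skip intercept
# fitting so that no input-dependent centering step is involved.
params = {"solver": "cholesky", "fit_intercept": False}

pred = Ridge(**params).fit(X, y).predict(X)
pred_sp = Ridge(**params).fit(X_sp, y).predict(X_sp)

# With an identical solver the two predictions should agree closely.
np.testing.assert_allclose(pred, pred_sp, rtol=1e-6, atol=1e-8)
```

This only shows agreement for one hand-picked configuration; it says nothing about the default parameters, which is the limitation raised in the comment above.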

@lesteve
Member

lesteve commented Nov 15, 2016

We might not be able to solve this problem. Ideally, we might want to show that sparse and dense data produce the same results under some parameter configuration, rather than the default. This, however, does not fit neatly in our common tests framework.

Can you not have a list of per-class parameters when the default parameters are not appropriate to compare outputs with dense and sparse input matrices, i.e. something like:

dense_vs_sparse_additional_params = {
    'Ridge': {'solver': 'cholesky'},
    # ...
}

Then you would use dense_vs_sparse_additional_params to build the estimators. If the class does not appear in dense_vs_sparse_additional_params then you just use the default parameters.
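The lookup described above could be sketched as follows. This is a hypothetical helper, not code from the PR: build_estimator and the two stand-in classes are invented for illustration (the stand-ins are not the scikit-learn estimators), and only the dictionary name and its 'Ridge' entry come from the comment.

```python
# Per-class overrides, keyed by class name (only the 'Ridge' entry is
# taken from the comment above).
dense_vs_sparse_additional_params = {
    "Ridge": {"solver": "cholesky"},
}


def build_estimator(estimator_class):
    """Instantiate estimator_class, applying overrides registered for its name.

    Classes without an entry are built with their default parameters.
    """
    extra = dense_vs_sparse_additional_params.get(estimator_class.__name__, {})
    return estimator_class(**extra)


# Stand-in classes for the demo; hypothetical, not scikit-learn estimators.
class Ridge:
    def __init__(self, solver="auto"):
        self.solver = solver


class Lasso:
    def __init__(self, alpha=1.0):
        self.alpha = alpha


ridge = build_estimator(Ridge)  # gets the registered override
lasso = build_estimator(Lasso)  # no entry, so defaults are used

print(ridge.solver, lasso.alpha)
```

Keying on the class name keeps the registry readable next to the test, at the cost of ambiguity if two classes share a name; keying on the class object itself would be an alternative.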

@jnothman
Member

That solution is okay, @lesteve, while we still don't really have solutions for extensible common testing.

@jnothman jnothman modified the milestones: 0.20, 0.19 Jun 13, 2017
@glemaitre glemaitre modified the milestones: 0.20, 0.21 Jun 13, 2018
@wdevazelhes
Contributor

I'd like to take this issue for the sprint if no one is working on it.

@jnothman jnothman removed this from the 0.22 milestone Apr 16, 2019
@jnothman
Member

Closing in favor of #13246. Also removing the milestone from this PR in favor of tagging the issue (#1572).

@jnothman jnothman closed this Apr 16, 2019