[MRG] Fix missing assert and parametrize some k-means tests #12368


Merged: 5 commits into scikit-learn:master on Oct 13, 2018

Conversation

jeremiedbb (Member):

Noticed a missing assert in k-means tests, meaning the test would always pass.

I took the opportunity to parametrize some of the k-means tests. I did not make any changes to the tests themselves, just removed code redundancy. I was doing this in #11950, but it will be easier to review as a separate PR here.
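
For context, a minimal illustration of why a missing assert makes a test pass unconditionally (a hypothetical sketch, not the actual diff from this PR): the comparison is evaluated and then discarded, so nothing is ever checked.

import numpy as np
from numpy.testing import assert_array_equal

labels = np.array([0, 1, 1])
expected = np.array([0, 1, 2])

# Without `assert`, the comparison result is computed and thrown away,
# so this "test" can never fail even though the arrays differ:
(labels == expected).all()

# With an explicit assert (or the numpy helper) the mismatch is caught:
# assert (labels == expected).all()       # would raise AssertionError
# assert_array_equal(labels, expected)    # would raise AssertionError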

@NicolasHug (Member) left a comment:

Very minor comment, other than that LGTM!

km = KMeans(init=centers.copy(), n_clusters=n_clusters, random_state=42,
            n_init=1)
km.fit(X)

@pytest.mark.parametrize('representation', ['dense', 'sparse'])
Member:

Why not directly

@pytest.mark.parametrize('data', [X, X_csr])

Member Author:

It's just for readability when you run pytest.

With your proposal, the tests would appear as
test_whatever_test_name[data0]
test_whatever_test_name[data1]

Here they appear as
test_whatever_test_name[dense]
test_whatever_test_name[sparse]

I just find it easier to track which parameters make a test fail.

Member:

Yeah that's a good point

Member:

FYI, it's possible to provide ids as a workaround:

@pytest.mark.parametrize('data', (X, X_csr), ids=('dense', 'sparse'))

maybe that's a bit more direct?

Member Author:

I didn't know that. This is better! Thanks.
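
For reference, a minimal runnable sketch of the ids approach discussed above, using placeholder arrays in place of the test module's X / X_csr fixtures:

import numpy as np
import pytest
from scipy import sparse

# Placeholder data; the real test module constructs X and X_csr differently.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
X_csr = sparse.csr_matrix(X)

@pytest.mark.parametrize('data', [X, X_csr], ids=['dense', 'sparse'])
def test_whatever_test_name(data):
    # Collected as test_whatever_test_name[dense] and
    # test_whatever_test_name[sparse] rather than [data0] / [data1].
    assert data.shape == (3, 2)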

@rth (Member) left a comment:

Thanks @jeremiedbb, this is nice. A few comments below.


# sanity check: predict centroid labels
pred = mb_k_means.predict(mb_k_means.cluster_centers_)
assert_array_equal(pred, np.arange(n_clusters))

# check that models trained on sparse input also works for dense input at
# predict time
assert_array_equal(mb_k_means.predict(X), mb_k_means.labels_)
Member:

Should we still keep this line?

Member Author:

I moved it into a new function: test_predict_minibatch_dense_sparse.
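
A rough sketch of what such a dense/sparse predict check could look like (hypothetical; the actual test_predict_minibatch_dense_sparse added in this PR may be organized differently):

import numpy as np
import pytest
from scipy import sparse
from numpy.testing import assert_array_equal
from sklearn.cluster import MiniBatchKMeans

X = np.random.RandomState(42).rand(100, 4)
X_csr = sparse.csr_matrix(X)

@pytest.mark.parametrize('data', [X, X_csr], ids=['dense', 'sparse'])
def test_predict_minibatch_dense_sparse(data):
    # Whether fitted on the dense or the sparse representation, the model
    # should label the dense input consistently with its labels_ attribute.
    mb_k_means = MiniBatchKMeans(n_clusters=3, random_state=42).fit(data)
    assert_array_equal(mb_k_means.predict(X), mb_k_means.labels_)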


# sanity check: predict centroid labels
pred = mb_k_means.predict(mb_k_means.cluster_centers_)
assert_array_equal(pred, np.arange(n_clusters))
Member:

Should we keep these 2 lines as well?

Member Author:

I did keep them :)

Member Author:

In test_predict_minibatch, lines 559-560.

@rth (Member) left a comment:

LGTM, true I missed those :)

Thanks @jeremiedbb and thank you for the review @NicolasHug!

@rth (Member) commented Oct 13, 2018:

BTW, Circle CI doesn't seem to be triggering now. https://status.circleci.com/ looks fine, so I'm not sure what happened. Anyway it should not affect this PR.

@rth merged commit 76b1078 into scikit-learn:master on Oct 13, 2018
jnothman pushed a commit to jnothman/scikit-learn that referenced this pull request Oct 15, 2018
anuragkapale pushed a commit to anuragkapale/scikit-learn that referenced this pull request Oct 23, 2018
@jeremiedbb deleted the fix-test-k-means branch on October 24, 2018, 11:53
xhluca pushed a commit to xhluca/scikit-learn that referenced this pull request Apr 28, 2019