[MRG] Fix sklearn.base.clone when estimator has any kind of sparse matrix as attribute #6910

lesteve · 2016-06-20T13:33:25Z

Reference Issue

TomDLT · 2016-06-20T15:32:17Z

The test fails with Python 2.6.
Otherwise LGTM

jnothman · 2016-06-20T15:37:51Z

sklearn/base.py

+        data = arr.data if sparse.issparse(arr) else arr
+        return data.flat[0], data.flat[-1]
+    except AttributeError:
+        # Sparse matrices without .data attribute. Only dok_matrix at


Not a strong test, but okay.
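
A minimal sketch of the idea in the diff excerpt above: read the first and last stored values from the `.data` buffer when the object has one, and fall back to plain indexing when it does not. The function name `first_and_last_element` and the body of the fallback branch are assumptions for illustration, since the excerpt is truncated and shows neither. Comparing only two values cannot catch differences in between, which may be why the check is called not a strong test.

```python
import numpy as np
from scipy import sparse


def first_and_last_element(arr):
    """Return the first and last stored value of an array or sparse matrix.

    Hypothetical helper modelled on the diff excerpt; empty inputs are
    not handled.
    """
    try:
        # Dense arrays are used directly; sparse matrices expose their
        # stored values through .data when the format has such a buffer.
        data = arr.data if sparse.issparse(arr) else arr
        return data.flat[0], data.flat[-1]
    except AttributeError:
        # Assumed fallback for sparse formats without a .data attribute
        # (dok_matrix in the discussion above): index the corners instead.
        return arr[0, 0], arr[-1, -1]


dense = np.arange(6).reshape(2, 3)
csr = sparse.csr_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))
dok = sparse.dok_matrix(np.array([[1.0, 0.0], [0.0, 2.0]]))

dense_pair = first_and_last_element(dense)
csr_pair = first_and_last_element(csr)
dok_pair = first_and_last_element(dok)
```

For formats with a `.data` buffer the pair describes the stored values, not necessarily the corner entries of the matrix, so it is only useful as a cheap equality hint between an original and its copy.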

jnothman · 2016-06-20T15:39:23Z

LGTM when test failure sorted.

lesteve · 2016-06-21T09:01:23Z

sklearn/tests/test_base.py

+
+    PY26 = sys.version_info[:2] == (2, 6)
+    if PY26:
+        # sp.dok_matrix can not be deepcopied in Python 2.6


For the record:

    from copy import deepcopy
    import numpy as np
    from scipy import sparse as sp
    m = sp.dok_matrix(np.eye(5))
    deepcopy(m)

fails on Python 2.6 with the error:

AttributeError: shape not found

For some reason it looks like the reconstructed matrix is missing some attributes. I think it's fine not testing dok_matrix for Python 2.6.

Yes, I think it's because Py2.6 has a specialised reconstruction routine for dicts and fails to handle the subclassing correctly... I'm happy with this solution.
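
The Python 2.6 workaround discussed above can be sketched as a version guard around the list of formats to check. This is an illustrative sketch, not the test from the PR: `formats_to_check` and `survives_deepcopy` are hypothetical helpers, and the explicit list of sparse classes is an assumption.

```python
import sys
from copy import deepcopy

import numpy as np
from scipy import sparse as sp


def formats_to_check():
    """Return the sparse classes to exercise (hypothetical helper)."""
    classes = [sp.bsr_matrix, sp.coo_matrix, sp.csc_matrix, sp.csr_matrix,
               sp.dia_matrix, sp.dok_matrix, sp.lil_matrix]
    if sys.version_info[:2] == (2, 6):
        # Per the discussion above, dok_matrix cannot be deepcopied on
        # Python 2.6, so it is left out there.
        classes.remove(sp.dok_matrix)
    return classes


def survives_deepcopy(cls):
    """Check that a deep copy keeps both the class and the values."""
    original = cls(np.eye(5))
    copied = deepcopy(original)
    same_class = type(copied) is type(original)
    same_values = np.array_equal(copied.toarray(), original.toarray())
    return same_class and same_values


results = dict((cls.__name__, survives_deepcopy(cls))
               for cls in formats_to_check())
```

Comparing through `toarray()` keeps the check independent of how each format stores its values.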


lesteve · 2016-06-21T13:15:46Z

@TomDLT @jnothman merge? You both gave a +1 conditional on the tests passing, and they pass now.

jnothman · 2016-06-21T13:27:46Z

Thanks @lesteve. I'll throw in a what's new.

… and documentation. Fixes #6862 (#6907)

* Make KernelCenterer a _pairwise operation. Replicate solution to 9a52077 except that `_pairwise` should always be `True` for `KernelCenterer` because it's supposed to receive a Gram matrix. This should make `KernelCenterer` usable in `Pipeline`s. Happy to add tests, just tell me what should be covered.
* Adding test for PR #6900
* Simplifying imports and test
* updating changelog links on homepage (#6901)
* first commit
* changed binary average back to macro
* changed binomialNB to multinomialNB
* emphasis on "higher return values are better..." (#6909)
* fix typo in comment of hierarchical clustering (#6912)
* [MRG] Allows KMeans/MiniBatchKMeans to use float32 internally by using cython fused types (#6846)
* Fix sklearn.base.clone for all scipy.sparse formats (#6910)
* DOC If git is not installed, need to catch OSError. Fixes #6860
* DOC add what's new for clone fix
* fix a typo in ridge.py (#6917)
* pep8
* TST: Speed up: cv=2. This is a smoke test. Hence there is no point having cv=4
* Added support for sample_weight in linearSVR, including tests and documentation
* Changed assert to assert_allclose and assert_almost_equal, reduced the test tolerance
* Fixed pep8 violations and sampleweight format
* rebased with upstream


lesteve mentioned this pull request Jun 20, 2016

base.clone fails if estimator has dia_matrix as a parameter #6855

Closed

lesteve force-pushed the fix-clone-with-sparse-matrix-attribute branch from 9681190 to 147055f on June 20, 2016 14:00

jnothman reviewed Jun 20, 2016

lesteve force-pushed the fix-clone-with-sparse-matrix-attribute branch 3 times, most recently from a98038a to a56522c on June 21, 2016 08:58

lesteve reviewed Jun 21, 2016

Fix sklearn.base.clone when estimator has any kind of sparse matrix as attribute. Add test. (808d584)

lesteve force-pushed the fix-clone-with-sparse-matrix-attribute branch from a56522c to 808d584 on June 21, 2016 09:30

jnothman merged commit ccefc2e into scikit-learn:master Jun 21, 2016

lesteve deleted the fix-clone-with-sparse-matrix-attribute branch June 21, 2016 13:30

imaculate pushed a commit to imaculate/scikit-learn that referenced this pull request Jun 23, 2016

Fix sklearn.base.clone for all scipy.sparse formats (scikit-learn#6910)

2accd0c

olologin pushed a commit to olologin/scikit-learn that referenced this pull request Aug 24, 2016

Fix sklearn.base.clone for all scipy.sparse formats (scikit-learn#6910)

edd9b8f

TomDLT pushed a commit to TomDLT/scikit-learn that referenced this pull request Oct 3, 2016

Fix sklearn.base.clone for all scipy.sparse formats (scikit-learn#6910)

ba09c39
