Description
T-SNE fails for a CSR matrix with:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
Code to reproduce:
import numpy as np
from sklearn.neighbors import BallTree, kneighbors_graph
from sklearn.manifold import TSNE
X = np.random.randn(100, 10)
bt = BallTree(X, leaf_size=300)
distances = kneighbors_graph(bt, n_neighbors=40, mode="distance", metric="cosine")
X_embedded = TSNE(n_components=2, metric="precomputed").fit_transform(distances)
Reason:
When the distance matrix is a square Compressed Sparse Row matrix, np.any(X < 0) also returns a sparse matrix, so evaluating it in the if statement raises the error above.
in ()
----> 1 X_embedded = TSNE(n_components=2, metric="precomputed").fit_transform(distances)
      2
      3 ax = plt.scatter(X_embedded[:,0], X_embedded[:,1], c=clusters[0:len(X_embedded)]).axes

/Users/roman/.virtualenvs/wordmap/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in fit_transform(self, X, y)
    857             Embedding of the training data in low-dimensional space.
    858         """
--> 859         embedding = self._fit(X)
    860         self.embedding_ = embedding
    861         return self.embedding_

/Users/roman/.virtualenvs/wordmap/lib/python2.7/site-packages/sklearn/manifold/t_sne.pyc in _fit(self, X, skip_num_points)
    645             if X.shape[0] != X.shape[1]:
    646                 raise ValueError("X should be a square distance matrix")
--> 647             if np.any(X < 0):
    648                 raise ValueError("All distances should be positive, the "
    649                                  "precomputed distances given as X is not "

/Users/roman/.virtualenvs/wordmap/lib/python2.7/site-packages/scipy/sparse/base.pyc in __bool__(self)
    236             return self.nnz != 0
    237         else:
--> 238             raise ValueError("The truth value of an array with more than one "
    239                              "element is ambiguous. Use a.any() or a.all().")
    240     __nonzero__ = __bool__

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().
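
One possible way to make the negativity check at line 647 work for both dense and sparse input is sketched below. This is only an illustration, not the actual scikit-learn fix: the helper name `has_negative` is hypothetical, and the sketch assumes that implicit (unstored) entries of a sparse matrix count as zero, so only the explicitly stored values need to be inspected.

```python
import numpy as np
from scipy import sparse


def has_negative(X):
    """Return True if X contains at least one negative entry.

    Hypothetical helper: accepts a dense array or a SciPy sparse matrix.
    """
    if sparse.issparse(X):
        # Unstored entries are zeros, so only the stored values can be
        # negative. Converting to CSR guarantees a flat `.data` array.
        return bool((X.tocsr().data < 0).any())
    return bool(np.any(np.asarray(X) < 0))


dense = np.array([[0.0, 0.5], [0.5, 0.0]])
csr = sparse.csr_matrix(dense)

print(has_negative(dense))  # False
print(has_negative(csr))    # False
print(has_negative(sparse.csr_matrix(np.array([[0.0, -1.0], [0.0, 0.0]]))))  # True
```

Because the helper always returns a plain Python bool, it can be used directly in an if statement without triggering the ambiguous truth value error.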