check_symmetric: adapt tolerance to data type #11159

Closed
@Celelibi

Description

By default, manifold.MDS computes the distance matrix with the same data type as the input. The function _smacof_single then calls check_symmetric without a tol argument, so tol takes its default of 1e-10. If the input matrix is float32, the distance matrix is float32 too, and that tolerance is too small for the type: it is well below float32 machine epsilon (about 1.2e-7), so ordinary rounding differences between an entry and its transpose can exceed it.
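To make the mismatch concrete, here is a small NumPy-only check that compares the 1e-10 default described above with the machine epsilon of each float type. The names DEFAULT_TOL and eps_by_dtype are made up for this illustration and are not part of scikit-learn.

```python
import numpy as np

# Default tol of check_symmetric, as stated in the description above.
DEFAULT_TOL = 1e-10


def eps_by_dtype():
    """Return the machine epsilon of each float type, keyed by dtype name."""
    dtypes = (np.float16, np.float32, np.float64)
    return {np.dtype(dt).name: float(np.finfo(dt).eps) for dt in dtypes}


for name, eps in eps_by_dtype().items():
    # A tolerance below eps cannot absorb even a single rounding step
    # on values of magnitude around 1.
    verdict = "below eps" if DEFAULT_TOL < eps else "above eps"
    print(f"{name}: eps={eps:.3g}, default tol is {verdict}")
```

Only for float64 does the default tolerance sit above machine epsilon; for float32 and float16 it is smaller than the spacing between adjacent representable values near 1.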

I guess the best solution would be to make check_symmetric adapt its default tolerance to the data type. I would suggest 1e-4 for float32 and 1e-2 for float16.
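A minimal sketch of what such a dtype-dependent default could look like. This is not scikit-learn's code: default_symmetry_tol and is_symmetric are hypothetical helpers written for this issue, the float32 and float16 values are the ones suggested above, and every other dtype is assumed to keep the current 1e-10 default.

```python
import numpy as np

# Hypothetical per-dtype defaults. The float16 and float32 values are the
# ones suggested in this issue; float64 keeps the existing 1e-10 default.
_DEFAULT_TOL = {
    np.dtype(np.float16): 1e-2,
    np.dtype(np.float32): 1e-4,
    np.dtype(np.float64): 1e-10,
}


def default_symmetry_tol(dtype):
    """Pick a default tolerance for dtype, falling back to 1e-10."""
    return _DEFAULT_TOL.get(np.dtype(dtype), 1e-10)


def is_symmetric(array, tol=None):
    """Return True if a dense square array equals its transpose within tol.

    When tol is None, the tolerance is chosen from the array's dtype.
    """
    array = np.asarray(array)
    if array.ndim != 2 or array.shape[0] != array.shape[1]:
        raise ValueError("array must be 2-dimensional and square")
    if tol is None:
        tol = default_symmetry_tol(array.dtype)
    return bool(np.abs(array - array.T).max() <= tol)


# A float32 matrix whose off-diagonal entries differ by roughly 1e-6.
mat = np.array([[0.0, 1.0], [1.0 + 1e-6, 0.0]], dtype=np.float32)
print(is_symmetric(mat))             # uses the float32 default (1e-4)
print(is_symmetric(mat, tol=1e-10))  # uses the current fixed default
```

An explicit tol still overrides the dtype-based default, so callers that rely on a strict check would not be affected by such a change.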

Steps/Code to Reproduce

import numpy as np
from sklearn.manifold import MDS

mat = np.random.rand(1000, 2).astype(np.float32)
MDS().fit_transform(mat)

Given the random initialization, you might need a few runs to trigger the error.

Expected Results

Not raising an exception.

Actual Results

Traceback (most recent call last):
  File "./testmds.py", line 7, in <module>
    MDS().fit_transform(mat)
  File "/usr/lib/python3/dist-packages/sklearn/manifold/mds.py", line 429, in fit_transform
    return_n_iter=True)
  File "/usr/lib/python3/dist-packages/sklearn/manifold/mds.py", line 254, in smacof
    eps=eps, random_state=random_state)
  File "/usr/lib/python3/dist-packages/sklearn/manifold/mds.py", line 70, in _smacof_single
    dissimilarities = check_symmetric(dissimilarities, raise_exception=True)
  File "/usr/lib/python3/dist-packages/sklearn/utils/validation.py", line 707, in check_symmetric
    raise ValueError("Array must be symmetric")
ValueError: Array must be symmetric

Versions

>>> import platform; print(platform.platform())
Linux-4.16.0-1-amd64-x86_64-with-debian-buster-sid
>>> import sys; print("Python", sys.version)
Python 3.6.5 (default, May 11 2018, 13:30:17) 
[GCC 7.3.0]
>>> import numpy; print("NumPy", numpy.__version__)
NumPy 1.14.3
>>> import scipy; print("SciPy", scipy.__version__)
SciPy 1.1.0
>>> import sklearn; print("Scikit-Learn", sklearn.__version__)
Scikit-Learn 0.19.1

Metadata

Assignees

No one assigned

Labels

Bug, Easy (well-defined and straightforward way to resolve), help wanted
