
LocallyLinearEmbedding : n_neighbors <= n_samples #29715

@Gabriel-Kissin

Describe the bug

Minor bug in LocallyLinearEmbedding's parameter validation:

if n_neighbors >= N:
    raise ValueError(
        "Expected n_neighbors <= n_samples, but n_samples = %d, n_neighbors = %d"
        % (N, n_neighbors)
    )

The if condition contradicts the error message when n_neighbors == N: the check rejects equality, but the message says equality is allowed. So you get a message like

ValueError: Expected n_neighbors <= n_samples,  but n_samples = 3, n_neighbors = 3

which doesn't make sense.
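One way to make the two agree is sketched below. It assumes the `>=` condition is the intended behaviour and only the message is wrong; whether to change the message or the condition is a call for the maintainers. `check_n_neighbors` and `error_message` are hypothetical stand-alone helpers written for illustration, not scikit-learn functions.

```python
def check_n_neighbors(n_neighbors: int, n_samples: int) -> None:
    """Hypothetical stand-alone copy of the check (not scikit-learn code).

    Assumption: the ``>=`` condition is kept as-is and the message is
    changed to state the strict inequality that the condition enforces.
    """
    if n_neighbors >= n_samples:
        raise ValueError(
            "Expected n_neighbors < n_samples, but n_samples = %d, n_neighbors = %d"
            % (n_samples, n_neighbors)
        )


def error_message(n_neighbors: int, n_samples: int) -> str:
    """Return the message raised by the check, or "" if the check passes."""
    try:
        check_n_neighbors(n_neighbors, n_samples)
    except ValueError as exc:
        return str(exc)
    return ""


# Equality is rejected, and the message now says so.
print(error_message(3, 3))
# One fewer neighbor than samples passes the check.
print(repr(error_message(2, 3)))
```

With this wording, the n_neighbors == n_samples case still raises, but the message no longer describes the rejected input as valid.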

Steps/Code to Reproduce

import numpy as np
import sklearn.manifold

X = np.random.randn(3, 5)

embedder = sklearn.manifold.LocallyLinearEmbedding(n_neighbors=X.shape[0])

embedder.fit_transform(X)

Expected Results

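n/a (an error may still be appropriate here, but its message should be consistent with the condition that triggers it)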

Actual Results

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1119], line 8
      4 X = np.random.randn(3, 5)
      6 embedder = sklearn.manifold.LocallyLinearEmbedding(n_neighbors=X.shape[0])
----> 8 embedder.fit_transform(X)

File ~/Library/Python/3.12/lib/python/site-packages/sklearn/utils/_set_output.py:313, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
    311 @wraps(f)
    312 def wrapped(self, X, *args, **kwargs):
--> 313     data_to_wrap = f(self, X, *args, **kwargs)
    314     if isinstance(data_to_wrap, tuple):
    315         # only wrap the first output for cross decomposition
    316         return_tuple = (
    317             _wrap_data_with_container(method, data_to_wrap[0], X, self),
    318             *data_to_wrap[1:],
    319         )

File ~/Library/Python/3.12/lib/python/site-packages/sklearn/base.py:1473, in _fit_context.<locals>.decorator.<locals>.wrapper(estimator, *args, **kwargs)
   1466     estimator._validate_params()
   1468 with config_context(
   1469     skip_parameter_validation=(
   1470         prefer_skip_nested_validation or global_skip_validation
   1471     )
   1472 ):
-> 1473     return fit_method(estimator, *args, **kwargs)

File ~/Library/Python/3.12/lib/python/site-packages/sklearn/manifold/_locally_linear.py:848, in LocallyLinearEmbedding.fit_transform(self, X, y)
    831 @_fit_context(prefer_skip_nested_validation=True)
    832 def fit_transform(self, X, y=None):
    833     """Compute the embedding vectors for data X and transform X.
    834 
    835     Parameters
   (...)
    846         Returns the instance itself.
    847     """
--> 848     self._fit_transform(X)
    849     return self.embedding_

File ~/Library/Python/3.12/lib/python/site-packages/sklearn/manifold/_locally_linear.py:795, in LocallyLinearEmbedding._fit_transform(self, X)
    793 X = self._validate_data(X, dtype=float)
    794 self.nbrs_.fit(X)
--> 795 self.embedding_, self.reconstruction_error_ = _locally_linear_embedding(
    796     X=self.nbrs_,
    797     n_neighbors=self.n_neighbors,
    798     n_components=self.n_components,
    799     eigen_solver=self.eigen_solver,
    800     tol=self.tol,
    801     max_iter=self.max_iter,
    802     method=self.method,
    803     hessian_tol=self.hessian_tol,
    804     modified_tol=self.modified_tol,
    805     random_state=random_state,
    806     reg=self.reg,
    807     n_jobs=self.n_jobs,
    808 )
    809 self._n_features_out = self.embedding_.shape[1]

File ~/Library/Python/3.12/lib/python/site-packages/sklearn/manifold/_locally_linear.py:227, in _locally_linear_embedding(X, n_neighbors, n_components, reg, eigen_solver, tol, max_iter, method, hessian_tol, modified_tol, random_state, n_jobs)
    223     raise ValueError(
    224         "output dimension must be less than or equal to input dimension"
    225     )
    226 if n_neighbors >= N:
--> 227     raise ValueError(
    228         "Expected n_neighbors <= n_samples,  but n_samples = %d, n_neighbors = %d"
    229         % (N, n_neighbors)
    230     )
    232 M_sparse = eigen_solver != "dense"
    234 if method == "standard":

ValueError: Expected n_neighbors <= n_samples,  but n_samples = 3, n_neighbors = 3

Versions

System:
    python: 3.12.3 (v3.12.3:f6650f9ad7, Apr  9 2024, 08:18:47) [Clang 13.0.0 (clang-1300.0.29.30)]
executable: /usr/local/bin/python3
   machine: macOS-14.5-arm64-arm-64bit

Python dependencies:
      sklearn: 1.5.0
          pip: 24.0
   setuptools: 70.0.0
        numpy: 1.26.4
        scipy: 1.13.0
       Cython: 3.0.10
       pandas: 2.2.2
   matplotlib: 3.8.4
       joblib: 1.4.2
threadpoolctl: 3.5.0

Built with OpenMP: True

threadpoolctl info:
       user_api: blas
   internal_api: openblas
    num_threads: 11
         prefix: libopenblas
       filepath: /Users/gabriel.kissin/Library/Python/3.12/lib/python/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: armv8

       user_api: blas
   internal_api: openblas
    num_threads: 11
         prefix: libopenblas
       filepath: /Users/gabriel.kissin/Library/Python/3.12/lib/python/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.26.dev
threading_layer: pthreads
   architecture: neoversen1

       user_api: openmp
   internal_api: openmp
    num_threads: 11
         prefix: libomp
       filepath: /Users/gabriel.kissin/Library/Python/3.12/lib/python/site-packages/sklearn/.dylibs/libomp.dylib
        version: None
