ENH Change the default n_init and eps for MDS #31117

Merged: 37 commits into scikit-learn:main from dkobak:mds-default-params on Apr 29, 2025

Conversation

@dkobak (Contributor) commented Mar 31, 2025

This is a follow-up to #30514 and was discussed there to some extent. It fixes two issues:

  1. The current default in MDS is n_init=4, which runs the full SMACOF optimization four times and keeps the best result. Other sklearn classes that offer this functionality use n_init=1 by default, e.g. sklearn.mixture.GaussianMixture. This appears much more sensible to me, so I change the default to n_init=1.

  2. The old convergence criterion was strange and unmotivated, and the default eps led to severe underconvergence on some simple datasets. I am changing it to a convergence criterion that (i) roughly follows the R implementation, (ii) makes sense for both metric and non-metric MDS, and (iii) is not affected by any rescaling of the input matrix X. The new convergence criterion is ((old_stress - stress) / ((distances.ravel() ** 2).sum() / 2)) < eps with default eps=1e-6, as in the R implementation; see the sketch right after this list.
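
For illustration, here is a minimal standalone sketch of the new stopping rule (the converged helper and the toy numbers are made up for this example and are not part of the sklearn API):

import numpy as np

def converged(old_stress, stress, distances, eps=1e-6):
    # Normalizing the stress decrease by sum(d_ij**2) / 2 makes the
    # criterion unaffected by rescaling the input dissimilarities
    normalization = (distances.ravel() ** 2).sum() / 2
    return (old_stress - stress) / normalization < eps

distances = np.array([[0.0, 1.0], [1.0, 0.0]])
print(converged(old_stress=0.5, stress=0.4999999, distances=distances))  # True: stop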

Apart from that, I fixed the formula for the "normalized stress" aka "stress-1" (as discussed in the previous PR), and added several tests.
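
For reference, Kruskal's stress-1 is conventionally defined as

$$\text{Stress-1} = \sqrt{\frac{\sum_{i<j} \left(\hat d_{ij} - d_{ij}\right)^2}{\sum_{i<j} d_{ij}^2}},$$

where $d_{ij}$ are the pairwise distances in the embedding and $\hat d_{ij}$ are the (possibly monotonically transformed) input dissimilarities.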

I implemented FutureWarnings (to be removed in v1.9) and corresponding tests.
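
Roughly, the deprecation follows the usual default-change pattern; this is a simplified sketch, not the exact scikit-learn code (the "warn" sentinel and the message text are illustrative):

import warnings

def fit(n_init="warn"):
    # "warn" is a sentinel meaning the user did not set n_init explicitly
    if n_init == "warn":
        warnings.warn(
            "The default value of `n_init` will change from 4 to 1 in 1.9."
            " Set n_init explicitly to suppress this warning.",
            FutureWarning,
        )
        n_init = 4  # keep the old behaviour until the default flips
    ...  # rest of the fitting logic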

Here is the result of running this code with the new default parameters on a small subset of the Digits dataset.

import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import MDS

X, y = load_digits(return_X_y=True)

# Random subset of 200 digits
rng = np.random.default_rng(seed=42)
ind = rng.choice(len(X), replace=False, size=200)

# Metric and non-metric MDS with the new defaults passed explicitly
mds1 = MDS(random_state=42, metric=True, normalized_stress=True, n_init=1, eps=1e-6)
Z1 = mds1.fit_transform(X[ind])
mds2 = MDS(random_state=42, metric=False, normalized_stress=True, n_init=1, eps=1e-6)
Z2 = mds2.fit_transform(X[ind])

plt.figure(figsize=(8, 4), layout="constrained")
plt.subplot(121)
plt.scatter(Z1[:, 0], Z1[:, 1], c=y[ind], s=3, cmap="tab10")
plt.title(f"metric MDS\nnorm. stress = {mds1.stress_:.2f}, n_iter = {mds1.n_iter_}")
plt.subplot(122)
plt.scatter(Z2[:, 0], Z2[:, 1], c=y[ind], s=3, cmap="tab10")
plt.title(f"non-metric MDS\nnorm. stress = {mds2.stress_:.2f}, n_iter = {mds2.n_iter_}")
plt.suptitle("Digits dataset, n=200 subset")
plt.savefig("mds2.png", facecolor="w", dpi=200)

[Figure: metric vs. non-metric MDS embeddings of the Digits subset; panel titles show normalized stress and n_iter]

Note that both embeddings converge within ~200 iterations, and that non-metric MDS reaches lower normalized stress than metric MDS, as expected (the monotonic transformation of the dissimilarities gives non-metric MDS extra freedom to reduce stress).

Running this with the current defaults (removing n_init=1, eps=1e-6 from the MDS calls) produces awful results, as the convergence criterion triggers far too early:

[Figure: same plots with the current default eps, showing severely underconverged embeddings]

Almost the same thing happens on main. So in my opinion the current eps value is dysfunctional, especially for non-metric MDS, and the current n_init value is a waste of computation.

github-actions bot commented Mar 31, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: 753913c.

@dkobak changed the title from "ENH Improve the convergence criterion and the default n_init for MDS" to "ENH Change the default n_init and eps for MDS" on Apr 2, 2025
@dkobak (Contributor, Author) commented Apr 9, 2025

@antoinebaker IMHO it would be great to have this in 1.7 so that the deprecation cycle rolls out by 1.9... It would be amazing if you could take a look. Thanks!

@antoinebaker (Contributor):

Will try! But I'm not sure we can make the 1.7 release, which is coming very soon :)

@antoinebaker (Contributor) left a review comment

Thanks a lot @dkobak for the follow-up PR!

The stopping criterion makes much more sense now, and the plot_mds.py example is improved by the tutorial-like presentation.

Even if our goal is not to reproduce the R code exactly, could you provide a comparison of the R / sklearn results on some examples, to see if the stress values found are in the same ballpark? I guess comparing the embeddings directly is more difficult because of rotational invariance (see the sketch below).
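
For reference, one way to compare two embeddings up to rotation is a Procrustes alignment; an illustrative sketch using scipy.spatial.procrustes (nothing in this PR relies on it):

import numpy as np
from scipy.spatial import procrustes

rng = np.random.default_rng(0)
Z_r = rng.normal(size=(14, 2))                     # stand-in for an R embedding
Z_py = Z_r @ np.array([[0.0, -1.0], [1.0, 0.0]])   # same points, rotated 90 degrees
_, _, disparity = procrustes(Z_r, Z_py)
print(f"disparity = {disparity:.2e}")  # ~0: identical up to rotation/scale/shift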

Below are a few suggestions to ease maintenance.

@dkobak (Contributor, Author) commented Apr 22, 2025

@antoinebaker Thanks for the review, I applied all your suggestions.

> Even if our goal is not to reproduce the R code exactly, could you provide a comparison of the R / sklearn results on some examples, to see if the stress values found are in the same ballpark?

This makes sense; however, I am not an R user, so I cannot do this very easily... If you think it is important, I can try to set it up.

@dkobak (Contributor, Author) commented Apr 22, 2025

Okay, I am trying to set up the comparison. I'm taking the data (the classic Ekman color-similarity matrix) from here: https://cran.r-project.org/web/packages/smacof/vignettes/mdsnutshell.html

import numpy as np

# Similarity matrix: the Ekman color data shipped with the smacof R package
S = np.array([[0.  , 0.86, 0.42, 0.42, 0.18, 0.06, 0.07, 0.04, 0.02, 0.07, 0.09,
        0.12, 0.13, 0.16],
       [0.86, 0.  , 0.5 , 0.44, 0.22, 0.09, 0.07, 0.07, 0.02, 0.04, 0.07,
        0.11, 0.13, 0.14],
       [0.42, 0.5 , 0.  , 0.81, 0.47, 0.17, 0.1 , 0.08, 0.02, 0.01, 0.02,
        0.01, 0.05, 0.03],
       [0.42, 0.44, 0.81, 0.  , 0.54, 0.25, 0.1 , 0.09, 0.02, 0.01, 0.  ,
        0.01, 0.02, 0.04],
       [0.18, 0.22, 0.47, 0.54, 0.  , 0.61, 0.31, 0.26, 0.07, 0.02, 0.02,
        0.01, 0.02, 0.  ],
       [0.06, 0.09, 0.17, 0.25, 0.61, 0.  , 0.62, 0.45, 0.14, 0.08, 0.02,
        0.02, 0.02, 0.01],
       [0.07, 0.07, 0.1 , 0.1 , 0.31, 0.62, 0.  , 0.73, 0.22, 0.14, 0.05,
        0.02, 0.02, 0.  ],
       [0.04, 0.07, 0.08, 0.09, 0.26, 0.45, 0.73, 0.  , 0.33, 0.19, 0.04,
        0.03, 0.02, 0.02],
       [0.02, 0.02, 0.02, 0.02, 0.07, 0.14, 0.22, 0.33, 0.  , 0.58, 0.37,
        0.27, 0.2 , 0.23],
       [0.07, 0.04, 0.01, 0.01, 0.02, 0.08, 0.14, 0.19, 0.58, 0.  , 0.74,
        0.5 , 0.41, 0.28],
       [0.09, 0.07, 0.02, 0.  , 0.02, 0.02, 0.05, 0.04, 0.37, 0.74, 0.  ,
        0.76, 0.62, 0.55],
       [0.12, 0.11, 0.01, 0.01, 0.01, 0.02, 0.02, 0.03, 0.27, 0.5 , 0.76,
        0.  , 0.85, 0.68],
       [0.13, 0.13, 0.05, 0.02, 0.02, 0.02, 0.02, 0.02, 0.2 , 0.41, 0.62,
        0.85, 0.  , 0.76],
       [0.16, 0.14, 0.03, 0.04, 0.  , 0.01, 0.  , 0.02, 0.23, 0.28, 0.55,
        0.68, 0.76, 0.  ]])
D = 1 - S  # convert similarities to dissimilarities
np.fill_diagonal(D, 0)  # self-dissimilarities must be exactly zero

I'm running R here: https://webr.r-wasm.org/latest/

library(smacof)
D = 1 - ekman            # the ekman similarity data ships with smacof
mds(D)                   # metric (ratio) MDS
mds(D, type="ordinal")   # non-metric MDS

Results:

Stress-1 value: 0.131 
Stress-1 value: 0.023 

In Python (this branch):

import numpy as np
from sklearn.manifold import MDS

np.random.seed(42)  # MDS(random_state=None) draws its init from the global RNG

# metric MDS: eps left at the pre-PR default vs. eps=1e-6
MDS(dissimilarity="precomputed", metric=True, normalized_stress=True, n_init=1).fit(D).stress_  # 0.137
MDS(dissimilarity="precomputed", metric=True, normalized_stress=True, n_init=1, eps=1e-6).fit(D).stress_  # 0.132
# non-metric MDS: eps left at the pre-PR default vs. eps=1e-6
MDS(dissimilarity="precomputed", metric=False, normalized_stress=True, n_init=1).fit(D).stress_  # 0.372
MDS(dissimilarity="precomputed", metric=False, normalized_stress=True, n_init=1, eps=1e-6).fit(D).stress_  # 0.031

I think the results match well enough! Clearly the new eps value is much more suitable for non-metric MDS.

Note: I cannot get down to 0.023 stress-1 with the Python implementation, and I am not sure why, but this is out of scope for this PR. For reference, here is the R code: https://github.com/cran/smacof/blob/master/R/smacofSym.R.

@dkobak (Contributor, Author) commented Apr 23, 2025

@antoinebaker I fixed the issue in my comparison code (I had to add np.fill_diagonal(D, 0) in Python); I think it now works well enough, see my updated comment above. Let me know what you think.

@antoinebaker (Contributor):

> @antoinebaker I fixed the issue in my comparison code (I had to add np.fill_diagonal(D, 0) in Python); I think it now works well enough, see my updated comment above. Let me know what you think.

Thanks for the comparison examples! To make them easy to share and reproduce, could you provide them as gists for the Python and R code? The Python one could be a notebook, and the R code just a plain script with the results in comments, as you have done. Another possibility is a small public repository with the two snippets.

@antoinebaker (Contributor):

As for the stress_ being a bit worse in the sklearn implementation, I agree that we could investigate further in a follow-up PR / issue and test the R smacof / sklearn MDS comparison more extensively.

This PR is already a much needed improvement for MDS, so +1 to merge it as is.

@dkobak (Contributor, Author) commented Apr 24, 2025

Here is the gist, with the R code in comments at the top: https://gist.github.com/dkobak/3daf73495b3da3b23dfbfd6b0b441030

@dkobak (Contributor, Author) commented Apr 24, 2025

Any chance to get it into 1.7? I am not sure what the timeline for that is.

@ogrisel (Member) commented Apr 25, 2025

Note that the suggestions in my review above are incomplete: tests also need to be updated accordingly.

dkobak and others added 8 commits on April 25, 2025 (co-authored by Olivier Grisel)
@dkobak (Contributor, Author) commented Apr 25, 2025

Thanks @ogrisel. I have edited the PR to make the change of eps effective immediately, and to use a FutureWarning only for n_init. CC @antoinebaker.

@ogrisel (Member) left a review comment

Thanks very much for the quick follow-up. Here is one final comment, but otherwise LGTM.

dkobak and others added 3 commits on April 28, 2025 (co-authored by Olivier Grisel)
@antoinebaker (Contributor) left a review comment

A few nitpicks (mainly removing now-superfluous eps=1e-6 arguments, since it is the new default), otherwise LGTM!

dkobak and others added 10 commits on April 29, 2025 (co-authored by antoinebaker)
@ogrisel enabled auto-merge (squash) on April 29, 2025
@ogrisel (Member) commented Apr 29, 2025

I enabled auto-merge so that the PR will be merged if CI is green after this commit.

@dkobak instead of committing individual review suggestions, you can press the "add suggestion to batch" button in the diff view of the PR and then make a single commit for the whole group of suggestions, to avoid unnecessary CI usage.

@ogrisel merged commit 31439d2 into scikit-learn:main on Apr 29, 2025 (36 checks passed)
@dkobak deleted the mds-default-params branch on April 29, 2025
@dkobak (Contributor, Author) commented Apr 29, 2025

Thanks. I noticed that the what's-new entries had wrong markup formatting (mea culpa), and I also found many other existing what's-new entries for 1.7 with wrong markup that render incorrectly. I fixed all of them in #31272.
