Implement classical MDS #31322

Open: dkobak wants to merge 5 commits into main

dkobak (Contributor) commented May 6, 2025

Fixes #15272. Supersedes #22330.

This PR implements classical MDS, also known as principal coordinates analysis (PCoA) or Torgerson's scaling; see https://en.wikipedia.org/wiki/Multidimensional_scaling#Classical_multidimensional_scaling. As discussed in #22330, it is implemented as a new class, ClassicalMDS.
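
For readers unfamiliar with the method, here is a minimal NumPy sketch of the textbook Torgerson procedure (double-centering the squared dissimilarities, then an eigendecomposition). It is an illustration only, not the code in this PR; the function name classical_mds and the choice to clip negative eigenvalues to zero are assumptions made for the example.

```python
import numpy as np


def classical_mds(D, n_components=2):
    """Embed points given a symmetric (n, n) dissimilarity matrix D.

    Illustrative helper, not the ClassicalMDS class proposed in this PR.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared dissimilarities: B = -1/2 * J D^2 J
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues in ascending order; keep the largest ones
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumption: negative eigenvalues (non-Euclidean input) are clipped to zero
    return eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))


# Usage: Euclidean distances between 3-D points are reproduced by a 3-D embedding
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
Z = classical_mds(D, n_components=3)
D_hat = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
```

With Euclidean input the embedding matches the PCA scores up to a sign flip per axis, which is why the first two panels of the demonstration look alike.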

Simple demonstration:

import matplotlib.pyplot as plt
import numpy as np

from sklearn.datasets import load_iris
from sklearn.manifold import ClassicalMDS, MDS
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

Z1 = PCA(n_components=2).fit_transform(X)
Z2 = ClassicalMDS(n_components=2, dissimilarity="euclidean").fit_transform(X)
Z3 = ClassicalMDS(n_components=2, dissimilarity="cosine").fit_transform(X)
Z4 = ClassicalMDS(n_components=2, dissimilarity="manhattan").fit_transform(X)

fig, axs = plt.subplots(nrows=2, ncols=2, figsize=(6, 6), layout="constrained")

axs.flat[0].scatter(Z1[:,0], Z1[:,1], c=y)
axs.flat[0].set_title("PCA")

axs.flat[1].scatter(Z2[:,0], Z2[:,1], c=y)
axs.flat[1].set_title("Classical MDS, Euclidean dist.")

# The first axis is negated only to match the orientation of the other panels
axs.flat[2].scatter(-Z3[:,0], Z3[:,1], c=y)
axs.flat[2].set_title("Classical MDS, cosine dist.")

axs.flat[3].scatter(Z4[:,0], Z4[:,1], c=y)
axs.flat[3].set_title("Classical MDS, Manhattan dist.")

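[Figure "cmds": 2×2 grid of scatter plots of the iris data. Panels: PCA; Classical MDS, Euclidean dist.; Classical MDS, cosine dist.; Classical MDS, Manhattan dist.]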

For consistency, this PR also adds support for non-Euclidean metrics to the MDS class.

github-actions bot commented May 6, 2025

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: b4f49d5. Link to the linter CI: here

dkobak (Contributor, Author) commented May 7, 2025

Note 1: I assume this PR won't make it in time for 1.7, which is why I wrote 1.8 in versionchanged. But I am of course happy if it does get merged into 1.7.

Note 2: I think it would make sense to use ClassicalMDS() as the default initialization for MDS(), making this:

MDS().fit(X, init=ClassicalMDS().fit_transform(X))

the default behavior of

MDS().fit(X)

I have not implemented this yet because I am not sure about the best API. The MDS() constructor could get a new init={"random", "classical_mds"} parameter, but then fit(init=Z) would need to be able to override it. For comparison, in the TSNE() class, init is a parameter of the constructor, not of fit(). We could do the same here and deprecate the init parameter of fit(), though I am not sure that is the most sensible option. In any case, I think TSNE() and MDS() should have the same API with regard to initialization.
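
To make the first option concrete, here is a rough sketch of how a constructor-level init could coexist with a fit-time override. Everything in it is hypothetical: resolve_init is not a scikit-learn function, the parameter names are invented for the example, and the "classical_mds" branch uses centered SVD scores, which coincide with classical MDS only for Euclidean distances.

```python
import numpy as np


def resolve_init(X, constructor_init="classical_mds", fit_init=None,
                 n_components=2, seed=0):
    """Hypothetical helper: choose the starting configuration for iterative MDS."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if fit_init is not None:
        # An explicit array passed at fit time overrides the constructor setting
        Z0 = np.asarray(fit_init, dtype=float)
        if Z0.shape != (n, n_components):
            raise ValueError("init must have shape (n_samples, n_components)")
        return Z0
    if constructor_init == "random":
        return np.random.default_rng(seed).uniform(size=(n, n_components))
    if constructor_init == "classical_mds":
        # Assumption: Euclidean distances, so classical MDS equals the
        # centered SVD (PCA) scores up to a sign flip per axis
        Xc = X - X.mean(axis=0)
        U, S, _ = np.linalg.svd(Xc, full_matrices=False)
        return U[:, :n_components] * S[:n_components]
    raise ValueError(f"unknown init: {constructor_init!r}")


X = np.random.default_rng(0).normal(size=(10, 4))
Z_default = resolve_init(X)                             # constructor default
Z_custom = resolve_init(X, fit_init=np.zeros((10, 2)))  # fit-time override
```

The alternative described above (constructor-only init, as in TSNE) would drop the fit_init argument entirely and accept an array through the constructor instead.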

Successfully merging this pull request may close these issues.

sklearn MDS vs skbio PCoA