FIX a bug in KernelPCA.inverse_transform #19732
Conversation
We had a thorough look at this issue/PR with @ogrisel. You are completely right that it is unjustified to add the regularization parameter at reconstruction time. We also think that it is not really useful to account for the mean loss with the linear kernel: it makes the linear kernel a special case, while one should rather use `PCA` in that situation. So for this PR, we can revert the change that added the regularization term. In addition, we think that we can make the following improvements:
In addition, I am thinking about some other improvements:
@kstoneriv3 Would you mind going forward by reverting the previous PR and improving the test? Do you wish to contribute to the future improvements?
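For readers following along, here is a rough sketch (simplified, not the actual scikit-learn source) of the behaviour being discussed: `alpha` regularizes the kernel ridge fit of the inverse map, while the change being reverted had also added it to the kernel at reconstruction time.

```python
from scipy import linalg
from sklearn.metrics.pairwise import rbf_kernel


def fit_inverse_transform(X_transformed, X, alpha, gamma):
    # Kernel ridge fit of the pre-image map: solve (K + alpha * I) dual_coef = X.
    # This is where the regularization parameter alpha belongs.
    K = rbf_kernel(X_transformed, gamma=gamma)
    K.flat[:: K.shape[0] + 1] += alpha
    return linalg.solve(K, X, assume_a="pos")


def inverse_transform(X_new_transformed, X_transformed_fit, dual_coef, gamma):
    # Reconstruction of new points: no extra alpha should be added to this
    # kernel, which is what the reverted change was doing.
    K = rbf_kernel(X_new_transformed, X_transformed_fit, gamma=gamma)
    return K @ dual_coef
```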
I agree that we should rather revert the buggy change and improve the documentation.
OK, I will replace the old test with this one.
Do you mean raising a warning when …? I generally agree with all of the suggestions you made. If no further discussion is needed before these updates, I can do it. Would you like to have some of these updates in the form of separate PRs?
If we do this, we should do it in a dedicated PR with backward compatibility. I think it is low priority.
Improving the docstring would be enough.
Indeed, separate PRs would be helpful. Let's start with undoing the previous change. For the test, we can keep a test that checks that we can approximately recover the original data, using the Frobenius norm of the reconstruction error. For the test data, you could try to use the Swiss roll dataset, for instance.
I tried the Swiss roll, but the reconstruction quality was not very good. This is because, for the Swiss roll, kernel PCA's 'rbf' kernel cannot capture well the similarity of data points along the third axis. So I just used a different dataset for the test.
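A minimal sketch of the kind of round-trip test discussed above, assuming a synthetic `make_blobs` dataset and an illustrative tolerance (both are my choices, not necessarily the ones used in the PR):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import KernelPCA


def test_kernel_pca_inverse_transform_reconstruction():
    # Check that inverse_transform approximately recovers the original data,
    # measured with a relative Frobenius norm.
    X, _ = make_blobs(n_samples=100, n_features=4, random_state=0)
    kpca = KernelPCA(
        n_components=20, kernel="rbf", fit_inverse_transform=True, alpha=1e-3
    )
    X_reconstructed = kpca.inverse_transform(kpca.fit_transform(X))
    error = np.linalg.norm(X - X_reconstructed, ord="fro") / np.linalg.norm(X, ord="fro")
    assert error < 1e-1
```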
I will make changes to the documentation later.
I have a question: it seems that centering of the kernel is not applied in the …
Isn't it what … does?
It seems to me that …
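For reference, kernel centering in scikit-learn is handled by `sklearn.preprocessing.KernelCenterer`, which `KernelPCA` uses internally when fitting and transforming. A small illustrative snippet of how it is applied to a kernel matrix:

```python
import numpy as np
from sklearn.preprocessing import KernelCenterer
from sklearn.metrics.pairwise import rbf_kernel

X = np.random.RandomState(0).normal(size=(50, 3))
K = rbf_kernel(X)

# Center the kernel matrix in feature space, as kernel PCA requires.
centerer = KernelCenterer().fit(K)
K_centered = centerer.transform(K)
```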
The progress status of this PR:
The following can be included in this PR if we agree on what to rename them to.
The following should be in separate PRs, I guess.
@kstoneriv3 I think that we can limit this PR to the first two points. It will be easier to review and merge. So I will review this PR shortly (@ogrisel can maybe do the second review). In parallel, do not hesitate to already open the subsequent PRs.
OK, great. Then I will leave this PR as it is and make separate PRs for the other issues.
I tried to reproduce the "denoising example from Section 4" but the denoising quality is not as good as I expected (left: training images, middle test images to be denoised, right: denoised images). At this quality, I would not add a new example of denoising or mention the effect of alpha on denoising in the document. This might be due to the fact that sklearn's kernel PCA shares kernel parameters while in the paper they use RBF kernel with different parameters for kernel PCA itself and preimage. The code is available at my gist. |
I tried to reproduce the exact example with the same hyperparameters as in the paper and the same dataset (USPS). I put the code in the "details" section. I get results close to the original paper:

# %%
from sklearn.datasets import fetch_openml
usps = fetch_openml(data_id=41082)
# %%
data = usps.data
target = usps.target
# %%
import numpy as np
img = np.reshape(data.iloc[0].to_numpy(), (16, 16))
# %%
import matplotlib.pyplot as plt
plt.imshow(img)
# %%
from sklearn.model_selection import train_test_split
data_rest, data_train, target_rest, target_train = train_test_split(
    data, target, stratify=target, random_state=42, test_size=100,
)
data_rest, data_test, target_rest, target_test = train_test_split(
    data_rest, target_rest, stratify=target_rest, random_state=42,
    test_size=100,
)
data_train, data_test = data_train.to_numpy(), data_test.to_numpy()
# %%
fig, axs = plt.subplots(nrows=10, ncols=10, figsize=(15, 15))
for img, ax in zip(data_test, axs.ravel()):
    ax.imshow(img.reshape((16, 16)), cmap="Greys")
    ax.axis("off")
_ = fig.suptitle("Uncorrupted test dataset")
# %%
rng = np.random.RandomState(0)
# add Gaussian noise to the test images (noise shape must match data_test)
noise = rng.normal(scale=0.5, size=data_test.shape)
data_test_corrupted = data_test + noise
# %%
fig, axs = plt.subplots(nrows=10, ncols=10, figsize=(15, 15))
for img, ax in zip(data_test_corrupted, axs.ravel()):
    ax.imshow(img.reshape((16, 16)), cmap="Greys")
    ax.axis("off")
_ = fig.suptitle(
    f"Corrupted test data: "
    f"MSE={np.mean((data_test - data_test_corrupted) ** 2):.2f}",
    size=26,
)
# %%
from sklearn.decomposition import KernelPCA
kpca = KernelPCA(
    n_components=80, kernel="rbf", gamma=0.5, fit_inverse_transform=True,
    alpha=1.0,
)
# %%
kpca.fit(data_train)
# %%
# reconstruct (denoise) the corrupted test images with kernel PCA
data_reconstruct = kpca.inverse_transform(kpca.transform(data_test_corrupted))
# %%
fig, axs = plt.subplots(nrows=10, ncols=10, figsize=(15, 15))
for img, ax in zip(data_reconstruct, axs.ravel()):
    ax.imshow(img.reshape((16, 16)), cmap="Greys")
    ax.axis("off")
_ = fig.suptitle(
    f"Denoising using Kernel PCA with RBF kernel: "
    f"MSE={np.mean((data_test - data_reconstruct) ** 2):.2f}",
    size=26,
)
# %%
from sklearn.decomposition import PCA
pca = PCA(n_components=32)
pca.fit(data_train)
data_reconstruct = pca.inverse_transform(pca.transform(data_test_corrupted))
# %%
fig, axs = plt.subplots(nrows=10, ncols=10, figsize=(15, 15))
for img, ax in zip(data_reconstruct, axs.ravel()):
    ax.imshow(img.reshape((16, 16)), cmap="Greys")
    ax.axis("off")
_ = fig.suptitle(
    f"Denoising using PCA: "
    f"MSE={np.mean((data_test - data_reconstruct) ** 2):.2f}",
    size=26,
)
We might want to play with the parameter alpha, since I am not sure about the standardization (it is not super precise), and I am not sure which normalization was applied to the dataset available on OpenML.
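A quick, illustrative way to check the value range of the OpenML data and, if needed, rescale it (using `MinMaxScaler` here is just one option; the variable names refer to the code above):

```python
from sklearn.preprocessing import MinMaxScaler

# Inspect the raw range to see which normalization the OpenML version uses.
print(data_train.min(), data_train.max())

# One possible rescaling to [0, 1], fit on the training split only.
scaler = MinMaxScaler().fit(data_train)
data_train_scaled = scaler.transform(data_train)
data_test_scaled = scaler.transform(data_test)
```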
@glemaitre Thank you! The denoising quality is much better than in my attempt! Now it makes sense to add this to the examples.
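Regarding the earlier point about the paper using different RBF parameters for the kernel PCA and the pre-image: here is a hypothetical sketch of how that could be approximated with a separate `KernelRidge` for the pre-image map, reusing `data_train` and `data_test_corrupted` from the code above (the `gamma`/`alpha` values are placeholders, not tuned):

```python
from sklearn.decomposition import KernelPCA
from sklearn.kernel_ridge import KernelRidge

# Forward transform with its own kernel parameters.
kpca = KernelPCA(n_components=80, kernel="rbf", gamma=0.5)
Z_train = kpca.fit_transform(data_train)

# Pre-image map learned separately, so it can use a different gamma/alpha.
preimage = KernelRidge(kernel="rbf", gamma=0.1, alpha=1.0)
preimage.fit(Z_train, data_train)

data_denoised = preimage.predict(kpca.transform(data_test_corrupted))
```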
LGTM. I am wondering if we could make the reconstruction better, but I don't know if it is needed. I would rely on a review from @ogrisel.
A quick note and this should be good:
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
@kstoneriv3 thanks for your work. I started to modify the example, and we saw with @ogrisel that this bug was really affecting the reconstruction results. I will push my changes for this example tomorrow. If you want, you can review it.
Reference Issues/PRs

Fix #18902.

As discussed in #18902, `PCA` reconstructs the mean of the data while `KernelPCA` does not. This results in inconsistent inverse transformations by `PCA` and (linear-)`KernelPCA`. This inconsistency led to a misunderstanding and the introduction of a bug in #16655.

What does this implement/fix? Explain your changes.

As discussed above, a bug was introduced in `KernelPCA.inverse_transform` in #16655. This PR removes this bug. Additionally, I suggest a small modification to `KernelPCA.inverse_transform` to improve its compatibility with `PCA.inverse_transform` when the linear kernel is used: reconstruct the mean when the linear kernel is used, so that `KernelPCA.inverse_transform` recovers the mean in the same way as `PCA.inverse_transform` does.
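To make the inconsistency described above concrete, here is a small illustrative sketch (my own, not part of the PR) comparing the round-trip reconstructions of `PCA` and a linear-kernel `KernelPCA`; the dataset and number of components are arbitrary choices:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, KernelPCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2).fit(X)
kpca = KernelPCA(
    n_components=2, kernel="linear", fit_inverse_transform=True
).fit(X)

X_back_pca = pca.inverse_transform(pca.transform(X))
X_back_kpca = kpca.inverse_transform(kpca.transform(X))

# Compare the mean of the original data with the means of both reconstructions;
# PCA adds the data mean back explicitly, while KernelPCA relies on the learned
# kernel ridge pre-image map instead.
print("data mean:               ", X.mean(axis=0))
print("PCA reconstruction:      ", X_back_pca.mean(axis=0))
print("KernelPCA reconstruction:", X_back_kpca.mean(axis=0))
```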