Fixed issue with KernelPCA.inverse_transform mean #16655
Conversation
Added the mean to the inverse_transform to fix the issue, and added tests as well.
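A minimal sketch of that first approach, assuming a hypothetical stored attribute for the training mean (an illustration of the idea, not the actual patch):

import numpy as np

# Hypothetical sketch: store the training mean at fit time, then add it back
# after applying the learned inverse map, so reconstructions are not zero-mean.
def inverse_transform_sketch(K, dual_coef, X_fit_mean):
    # K: kernel between the new projections and the training projections
    # dual_coef: coefficients of the learned inverse map
    # X_fit_mean: mean of the training data, stored at fit time (hypothetical)
    return np.dot(K, dual_coef) + X_fit_mean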
Thanks!
Please add an entry to the change log at doc/whats_new/v0.23.rst. Like the other entries there, please reference this pull request with :pr: and credit yourself (and other contributors if applicable) with :user:.
Please also note ``versionchanged`` in the docstring of ``inverse_transform``.
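For reference, a sketch of what the requested docstring note could look like (the surrounding docstring text here is illustrative, not the actual docstring):

def inverse_transform(self, X):
    """Transform X back to original space.

    .. versionchanged:: 0.23
       The inverse transform was fixed in this pull request; see the
       change log entry for details.
    """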
kp = KernelPCA(n_components=2, kernel=kernel, fit_inverse_transform=True)
X_trans = kp.fit_transform(X)
X_inv = kp.inverse_transform(X_trans)
assert np.isclose(X.mean(axis=0), X_inv.mean(axis=0)).all()
Should we be able to assert re the values, not just the means?
Good point, I've updated the test now.
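Presumably the strengthened assertion compares the full arrays rather than just their means, along these lines (a sketch reusing X and kernel from the snippet above, not the exact diff):

kp = KernelPCA(n_components=2, kernel=kernel, fit_inverse_transform=True)
X_trans = kp.fit_transform(X)
X_inv = kp.inverse_transform(X_trans)
assert np.isclose(X, X_inv).all()  # compare every value, not only the means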
- Added entry in whats new v0.23
- Updated tests to check for closeness of the full data set, not just closeness of the mean.
- Updated the fix for this bug. Realised that it wasn't an issue with the mean not being added on, but instead that self.alpha was not being taken into account in the inverse transform.
I realized that the bug wasn't actually due to the mean not being added, but was due to alpha not being handled properly in the inverse_transform. So I have updated the fix and have also added an entry in the change log.
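To illustrate the role of alpha: the inverse map in kernel PCA is learned by ridge-regularized kernel regression, roughly as in the sketch below (simplified, with an RBF kernel and made-up function names; not the actual scikit-learn code):

import numpy as np
from scipy import linalg
from sklearn.metrics.pairwise import rbf_kernel

def fit_inverse_map(X_transformed, X, alpha=1.0):
    # Kernel ridge regression from the projected points back to X;
    # alpha is the regularization strength added to the kernel diagonal.
    K = rbf_kernel(X_transformed)
    n = K.shape[0]
    K.flat[::n + 1] += alpha  # K + alpha * I
    return linalg.solve(K, X, assume_a="pos")

def inverse_map(X_new, X_transformed_fit, dual_coef):
    # Reconstruct: kernel between new and training projections times the coefficients.
    return rbf_kernel(X_new, X_transformed_fit) @ dual_coef

If alpha is dropped, or applied inconsistently, on either side of this pair, the reconstruction is systematically off, which matches the behaviour described above.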
Thanks for the update
@@ -138,6 +138,10 @@ Changelog
:func:`decomposition.non_negative_factorization` now preserves float32 dtype.
:pr:`16280` by :user:`Jeremie du Boisberranger <jeremiedbb>`.

- |Fix| :class:`decomposition.KernelPCA` method ``inverse_transform`` now
  applies the correct inverse transform to the transformed data.
  :pr:`16655` by :user:`Lewis Ball <lrjball>`.
Maybe "fixed ... in the case that data was not centred" would be more helpful to users.
So actually, after doing some digging, it was still returning the wrong thing even for centered data. I only noticed the bug in non-centered data because the mean of the inverse-transformed data was zero when the original data set was not centered.
For example, in 0.22.0 the following still does not work:
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import KernelPCA

X, _ = make_blobs(n_samples=100, centers=[[1, 1, 1, 1]], random_state=0)
X = X - X.mean(axis=0)  # center the data explicitly
kp = KernelPCA(n_components=2, fit_inverse_transform=True)
X_trans = kp.fit_transform(X)
X_inv = kp.inverse_transform(X_trans)
assert np.isclose(X, X_inv).all()  # fails in 0.22.0 even though X is centered
So this PR fixes the inverse_transform function for all X.
However, I can still update the message if there is a better way of phrasing this change.
It seems correct. Just a small change in the test to make an assert on the numpy array.
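The suggested change is presumably to use a dedicated array assertion rather than np.isclose(...).all(), for example:

from numpy.testing import assert_allclose

assert_allclose(X, X_inv)  # element-wise check with an informative failure message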
Co-Authored-By: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Thank you, @lrjball!
@glemaitre, does this have your approval?
@lrjball Thanks for the fix
It seems to me that this PR broke the code. Could you check #18902?
Though it was concluded here that alpha was not being handled correctly, it is in fact handled correctly in the original implementation. The problem was that the mean was not added back to the data when the kernel is linear. When the linear kernel is used, the information about the mean is completely lost through the centering of the kernel, while with a non-linear kernel it is only partially lost.
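That claim about the linear kernel is easy to verify with standard scikit-learn utilities: the centered Gram matrix of a linear kernel is identical whether or not the data had a nonzero mean, so the mean cannot be recovered from it (a small demonstration, not code from this PR):

import numpy as np
from sklearn.preprocessing import KernelCenterer

rng = np.random.RandomState(0)
X = rng.randn(5, 3) + 10.0  # data with a large mean
Xc = X - X.mean(axis=0)     # the same data, centered

# Centering a linear kernel erases the mean entirely:
K = KernelCenterer().fit_transform(X @ X.T)
Kc = KernelCenterer().fit_transform(Xc @ Xc.T)
assert np.allclose(K, Kc)   # identical Gram matrices -> the mean is unrecoverable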
@jnothman @glemaitre |
Currently KernelPCA.inverse_transform returns a data set with zero mean, even if the original data did not have zero mean. I believe that this PR fixes that issue, so that the mean of the inverse-transformed data set is the same as the mean of the original data set.
I've also added a test for this update.
Reference Issues/PRs
Fixes #16654