[MRG+1] Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA) #8531

AlaaMoussawi · 2017-03-04T20:40:49Z

Fixes #8368
Addresses nan output from fit_transform() function of sklearn/decomposition/kernel_pca.py
Handles nan output when taking square root of zero with nan_to_num function. Suppresses warnings.

jnothman · 2017-03-05T12:11:44Z

sklearn/decomposition/kernel_pca.py

+        # All eigenvalues of Covariance matrix are >= 0
+        # Ensure root of zero is not evaluated as nan
+        try:
+            err_mgt = np.seterr(all='ignore')


Please use with np.errstate
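For context, a minimal sketch of the difference behind this request: np.seterr changes NumPy's global floating-point error handling and has to be restored by hand, while np.errstate is a context manager that restores the previous state on exit. The eigenvalue array below is made up for illustration and is not taken from the PR.

```python
import numpy as np

# Made-up eigenvalues; the small negative entry stands in for a
# numerically negative eigenvalue.
lambdas = np.array([4.0, 1.0, -1e-3])

# Manual approach, as in the diff: save the old settings and restore them yourself.
old_settings = np.seterr(all='ignore')
try:
    roots_manual = np.nan_to_num(np.sqrt(lambdas))
finally:
    np.seterr(**old_settings)

# Context-manager approach: the previous settings come back automatically,
# even if an exception is raised inside the block.
state_before = np.geterr()
with np.errstate(all='ignore'):
    roots_ctx = np.nan_to_num(np.sqrt(lambdas))
state_after = np.geterr()
```

Both variants produce the same array here; the context manager simply removes the need for the try/finally bookkeeping.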

jnothman · 2017-03-05T12:16:21Z

sklearn/decomposition/tests/test_kernel_pca.py

+                     gamma=settings_gamma,
+                     coef0=settings_coef0)
+
+    # kpca_transform = kpca.fit_transform(data)


Please remove the debugging code

raghavrv

Thanks for the PR! Please add a whatsnew entry...

raghavrv · 2017-03-13T07:24:48Z

sklearn/decomposition/kernel_pca.py

+        # Ensure root of zero is not evaluated as nan
+
+        with np.errstate(all='ignore'):
+            X_transformed = self.alphas_*np.nan_to_num(np.sqrt(self.lambdas_))


Please add spaces around the * operator.

raghavrv · 2017-03-13T07:29:50Z

sklearn/decomposition/tests/test_kernel_pca.py

@@ -3,7 +3,8 @@

 from sklearn.utils.testing import (assert_array_almost_equal, assert_less,
                                   assert_equal, assert_not_equal,
-                                   assert_raises)


Could you rewrite it as one import per line? (helps avoid merge conflicts)

raghavrv · 2017-03-13T07:30:38Z

sklearn/decomposition/tests/test_kernel_pca.py

+                     coef0=settings_coef0)
+
+    output = assert_no_warnings(kpca.fit_transform, data)
+


all blank lines can be removed... (except the one after data maybe)

raghavrv · 2017-03-13T07:31:20Z

sklearn/decomposition/tests/test_kernel_pca.py

+
+    output = assert_no_warnings(kpca.fit_transform, data)
+
+    assert_false(np.isnan(output).any())


(just to confirm, it fails at master?)

AlaaMoussawi · 2017-03-20

The original code would not fail at master, but it would output warning messages, and not give quite the desired output. Also, I'm not quite sure what a "whatsnew" entry is, but I don't think this would qualify. If you feel that it would, could you please provide a link that I can use to understand more about what a "whatsnew" entry is?
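To illustrate what "warnings but no failure" looks like in a test, here is a rough sketch using only the standard library's warnings module and NumPy, rather than the assert_no_warnings helper from the diff. The function scaled_components and the input arrays are hypothetical stand-ins for the line under review, not code from the PR.

```python
import warnings

import numpy as np


def scaled_components(alphas, lambdas):
    # Hypothetical stand-in for the reviewed line: scale the eigenvectors by
    # the square root of the eigenvalues, mapping NaN roots to 0.
    with np.errstate(invalid='ignore'):
        return alphas * np.nan_to_num(np.sqrt(lambdas))


alphas = np.eye(3)                      # made-up eigenvectors
lambdas = np.array([4.0, 1.0, -1e-3])   # made-up eigenvalues, one negative

# Guarded computation: record any warnings raised while it runs.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    output = scaled_components(alphas, lambdas)
guarded_runtime = [w for w in caught if issubclass(w.category, RuntimeWarning)]

# Same computation without the guard, for comparison.
with warnings.catch_warnings(record=True) as caught_unguarded:
    warnings.simplefilter("always")
    with np.errstate(invalid='warn'):
        unguarded = alphas * np.sqrt(lambdas)
unguarded_runtime = [w for w in caught_unguarded
                     if issubclass(w.category, RuntimeWarning)]
```

The unguarded version still runs to completion, which matches the observation that nothing fails outright; the problem only shows up as a RuntimeWarning and NaN entries in the result.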

raghavrv · 2017-03-13T07:32:35Z

sklearn/decomposition/tests/test_kernel_pca.py

+
+    kpca = KernelPCA(n_components=data.shape[1],
+                     kernel=kernel_type,
+                     degree=settings_degree,


directly substitute all of them as you are anyway using named args for clarity...

jnothman · 2017-03-22T01:34:03Z

Please provide more descriptive commit messages

jnothman

LGTM

lesteve

I think we should set negative eigenvalues (i.e. negative entries of self.lambdas_) to zero in _fit_transform, after the if self.remove_zero_eig or self.n_components is None block.

jnothman · 2017-03-22T08:23:01Z

i think the problem may be that sometimes it stores negative zero, no?


lesteve · 2017-03-22T08:40:38Z

> i think the problem may be that sometimes it stores negative zero, no?

I think so. It feels that changing self.lambdas_ in _fit_transform (which is actually called by fit) is a cleaner fix. At the moment the fix is in fit_transform but not transform for example.

jnothman · 2017-03-22T09:44:09Z

Yes, that's a problem.


jnothman · 2017-03-22T09:44:44Z

Good catch. Will explicitly setting to zero ensure it's never negative? I suppose so.


lesteve · 2017-03-22T10:17:13Z

> Good catch. Will explicitly setting to zero ensure it's never negative? I suppose so.

I am not a KernelPCA expert by any means, but it seems from this line that when remove_zero_eig is True, we remove negative values in self.lambdas_, so it may make sense to set them to zero when remove_zero_eig is False, especially since np.sqrt is applied to self.lambdas_ further down the line.

Maybe the 'sigmoid' kernel is to blame, I am not sure ... in the stand-alone snippet from the associated issue, kpca.lambdas_ has a negative eigenvalue of the order 1e-3, so it is hard to blame floating point precision for this.
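A quick check of the two hypotheses discussed above, as a sketch with made-up numbers: under IEEE 754 the square root of negative zero is negative zero, not NaN, so only a genuinely negative eigenvalue produces NaN. Clamping at zero before taking the root, as suggested, removes the NaN in both cases. This is an illustration of the idea, not the change that was eventually made in scikit-learn.

```python
import numpy as np

with np.errstate(invalid='ignore'):
    # Negative zero: the root is -0.0, which is not NaN.
    root_of_neg_zero = np.sqrt(np.float64(-0.0))
    # A small but genuinely negative value: the root is NaN.
    root_of_small_neg = np.sqrt(np.float64(-1e-3))

# Made-up eigenvalues containing both cases.
lambdas = np.array([2.5, 0.3, -0.0, -1e-3])

# Clamp negative eigenvalues to zero once, then take the root.
clamped = np.maximum(lambdas, 0.0)
roots = np.sqrt(clamped)
```

If the clamping happened once when the eigenvalues are stored, every later use of them would see the same non-negative values, which is the consistency argument made for putting the fix in _fit_transform.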

amueller · 2019-08-06T19:35:06Z

What's the status of this? I can still reproduce the issue on master.

jnothman · 2019-10-27T22:16:51Z

can a couple of people find time to think this through?

NicolasHug · 2020-04-01T12:56:41Z

Now that #12145 is merged, the error is

ValueError: There are significant negative eigenvalues (0.19119 of the maximum positive). Either the matrix is not PSD, or there was an issue while computing the eigendecomposition of the matrix.

Tweaking coef0 and / or gamma (or just the kernel) will remove the error.

So it seems that the specific settings of the kernel are the cause of the issue, and it's now properly handled by erroring with a nice message instead of returning NaNs. I'll then close the issue and the PR, but feel free to re-open if I missed something. Thanks everyone for the efforts!
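To make the "kernel settings are the cause" point concrete, here is a small sketch showing that a sigmoid kernel matrix, tanh(gamma * <x, y> + coef0), need not be positive semi-definite. The data, the parameter values, the helper names and the tolerance are all assumptions chosen for illustration; the check is not the one added in #12145, and unlike KernelPCA this sketch does not center the kernel matrix.

```python
import numpy as np


def sigmoid_gram(X, gamma, coef0):
    # Sigmoid kernel matrix: tanh(gamma * <x, y> + coef0) for all pairs of rows.
    return np.tanh(gamma * (X @ X.T) + coef0)


def check_eigenvalues_sketch(lambdas, rel_tol=1e-5):
    # Hypothetical check: reject eigenvalues that are negative by more than
    # rel_tol times the largest positive one, otherwise clamp at zero.
    max_pos = lambdas.max()
    min_neg = lambdas.min()
    if max_pos <= 0:
        raise ValueError("No positive eigenvalue found.")
    if min_neg < -rel_tol * max_pos:
        raise ValueError(
            "Significant negative eigenvalue: %.5g of the maximum positive."
            % (-min_neg / max_pos)
        )
    return np.maximum(lambdas, 0.0)


# Two made-up one-dimensional samples. With coef0 < 0 the first diagonal
# entry is tanh(-1) < 0, so the matrix cannot be positive semi-definite.
X = np.array([[0.0], [1.0]])
K = sigmoid_gram(X, gamma=1.0, coef0=-1.0)
lambdas = np.linalg.eigvalsh(K)

try:
    check_eigenvalues_sketch(lambdas)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Raising an error in this situation, instead of silently mapping the NaN roots to zero, tells the user that the chosen kernel parameters produce an invalid kernel matrix for the data.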

Fix for 8368

9fea9e8

AlaaMoussawi changed the title ~~Fix for 8368~~ Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA) Mar 4, 2017

Message

65dd8e7

jnothman reviewed Mar 5, 2017


AlaaMoussawi and others added 3 commits March 8, 2017 18:34

Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)

9b59dc9

Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)3

593717e

Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)4

409afbe

raghavrv suggested changes Mar 13, 2017


raghavrv changed the title ~~Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)~~ [WIP] Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA) Mar 13, 2017

raghavrv added the Bug label Mar 13, 2017

Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)

5e82199

jnothman reviewed Mar 22, 2017


jnothman changed the title ~~[WIP] Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA)~~ [MRG+1] Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA) Mar 22, 2017

lesteve requested changes Mar 22, 2017


jnothman added this to the 0.19 milestone Jun 18, 2017

rth modified the milestones: 0.19, 0.22 Jun 16, 2019

jnothman mentioned this pull request Oct 27, 2019

[MRG before #12069] KernelPCA: raise Errors and Warnings according to eigenvalue decomposition numerical/conditioning issues #12145

Merged

jnothman added the Needs Decision and Waiting for Reviewer labels Oct 31, 2019

jnothman modified the milestones: 0.22, 0.23 Oct 31, 2019

github-actions bot added the module:decomposition label Mar 2, 2020

NicolasHug closed this Apr 1, 2020

NicolasHug mentioned this pull request Apr 1, 2020

KPCA fit_transform() will be nan when lambdas_ is negative #8368

Closed
