-
-
Notifications
You must be signed in to change notification settings - Fork 26k
[MRG+1] Fix for 8368 (Addresses nan output from fit_transform() in kernelPCA) #8531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sklearn/decomposition/kernel_pca.py
Outdated
# All eigenvalues of Covariance matrix are >= 0 | ||
# Ensure root of zero is not evaluated as nan | ||
try: | ||
err_mgt = np.seterr(all='ignore') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use with np.errstate
gamma=settings_gamma, | ||
coef0=settings_coef0) | ||
|
||
# kpca_transform = kpca.fit_transform(data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the debugging code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Please add a whatsnew entry...
sklearn/decomposition/kernel_pca.py
Outdated
# Ensure root of zero is not evaluated as nan | ||
|
||
with np.errstate(all='ignore'): | ||
X_transformed = self.alphas_*np.nan_to_num(np.sqrt(self.lambdas_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces surrounding *
@@ -3,7 +3,8 @@ | |||
|
|||
from sklearn.utils.testing import (assert_array_almost_equal, assert_less, | |||
assert_equal, assert_not_equal, | |||
assert_raises) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rewrite it as one import per line? (helps avoid merge conflicts)
coef0=settings_coef0) | ||
|
||
output = assert_no_warnings(kpca.fit_transform, data) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all blank lines can be removed... (except the one after data
maybe)
|
||
output = assert_no_warnings(kpca.fit_transform, data) | ||
|
||
assert_false(np.isnan(output).any()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(just to confirm, it fails at master?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code would not fail at master, but it would output warning messages, and not give quite the desired output. Also, I'm not quite sure what a "whatsnew" entry is, but I don't think this would qualify. If you feel that it would, could you please provide a link that I can use to understand more about what a "whatsnew" entry is?
|
||
kpca = KernelPCA(n_components=data.shape[1], | ||
kernel=kernel_type, | ||
degree=settings_degree, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
directly substitute all of them as you are anyway using named args for clarity...
Please provide more descriptive commit messages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should set negative eigenvalues (i.e. self.lambdas_
to zero in _fit_transform
after the if self.remove_zero_eig or self.n_components is None
.
i think the problem may be that sometimes it stores negative zero, no?
…On 22 Mar 2017 6:47 pm, "Loïc Estève" ***@***.***> wrote:
***@***.**** requested changes on this pull request.
I think we should set eigenvalues to zero in _fit_transform here
<https://github.com/scikit-learn/scikit-learn/pull/8531/files#diff-d6cfec34cb25a8568bc721be1ce40e44L204>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8531 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz687XjYC5Z_EATTAoF8el8Bum4OIIks5roNIkgaJpZM4MTNMY>
.
|
I think so. It feels that changing |
Yes, that's a problem.
…On 22 March 2017 at 19:40, Loïc Estève ***@***.***> wrote:
i think the problem may be that sometimes it stores negative zero, no?
I think so. It feels that changing self.lambdas_ in _fit_transform (which
is actually called by fit) is a cleaner fix. At the moment the fix is in
fit_transform but not transform for example.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#8531 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEz68OITixFT3ZRqWZSFVAnMcuX1H98ks5roN6HgaJpZM4MTNMY>
.
|
Good catch. Will explicitly setting to zero ensure it's never negative? I
suppose so.
…On 22 March 2017 at 20:44, Joel Nothman ***@***.***> wrote:
Yes, that's a problem.
On 22 March 2017 at 19:40, Loïc Estève ***@***.***> wrote:
> i think the problem may be that sometimes it stores negative zero, no?
>
> I think so. It feels that changing self.lambdas_ in _fit_transform
> (which is actually called by fit) is a cleaner fix. At the moment the
> fix is in fit_transform but not transform for example.
>
> —
> You are receiving this because you commented.
> Reply to this email directly, view it on GitHub
> <#8531 (comment)>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/AAEz68OITixFT3ZRqWZSFVAnMcuX1H98ks5roN6HgaJpZM4MTNMY>
> .
>
|
I am not a KernelPCA expert by any means but it seems from this line that when Maybe the |
What's the status of this? I can still reproduce the issue on master. |
can a couple of people find time to think this through? |
Now that #12145 is merged, the error is
Tweeking So it seems that the specific settings of the kernel are the cause of the issue, and it's now properly handled by erroring with a nice message instead of returning NaNs. I'll then close the issue and the PR, but feel free to re-open if I missed something. Thanks everyone for the efforts! |
Fixes #8368
Addresses nan output from fit_transform() function of sklearn/decomposition/kernel_pca.py
Handles nan output when taking square root of zero with nan_to_num function. Suppresses warnings.
Reference Issue
What does this implement/fix? Explain your changes.
Any other comments?