Fix Make centering inplace in KernelPCA #29100
Conversation
LGTM. Thanks @jeremiedbb
Just curious: does this optimization make any difference in a real use case, or is it more of a clean-up? I am not a big fan of functions with side effects, which make it easier to shoot yourself in the foot later. Maybe two suggestions:
I added some comments to make it clear that we do in-place ops on purpose.
I was trying to build a common test for estimators with a …
sklearn/decomposition/_kernel_pca.py (Outdated)

```python
@@ -439,7 +439,9 @@ def fit(self, X, y=None):
    self.gamma_ = 1 / X.shape[1] if self.gamma is None else self.gamma
    self._centerer = KernelCenterer().set_output(transform="default")
    K = self._get_kernel(X)
    self._fit_transform(K)
    # safe to perform in place operations on K because a copy was made before if …
```
So I think that was the thing I was not sure about: `K` is a new array, right? Is there a way that `K` could somehow be a view on `X` for a particular kernel, such that transforming `K` in place would change `X`?
`K` is `X` in the case of a precomputed kernel with `copy_X=False`. The goal of `copy_X=False` is to allow changing `X` in place.
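The aliasing described above can be sketched in plain NumPy. This is a hypothetical illustration, not scikit-learn's actual code; the fact that the fitted kernel is the user's `X` itself in the precomputed case comes from this discussion:

```python
import numpy as np

# Sketch: with kernel="precomputed" and copy_X=False, the kernel matrix K
# used inside fit is the very array the caller passed in, so an in-place
# centering step would mutate the caller's data.
rng = np.random.default_rng(0)
G = rng.normal(size=(5, 3))
X = G @ G.T                  # the user's precomputed kernel matrix
X_before = X.copy()

K = X                        # no copy: K aliases the caller's array
K -= K.mean(axis=0, keepdims=True)   # one in-place centering step

print(np.shares_memory(K, X))        # True: same underlying buffer
print(np.array_equal(X, X_before))   # False: the caller's matrix changed
```

This is exactly why `copy_X=True` (the default) makes a defensive copy first.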
OK, the `'precomputed'` case was not obvious to me. Can you add this to the comment, and also find a way to mention that `copy_X` has already made a copy if needed? (This is something you probably mentioned in the issue, but it wasn't obvious to me either 😅)
Indeed let's add a comment to make that case explicit.
I am not a big fan of this either 😉. Also since …
The purpose of …
Anyway, let's not use this PR as the place to discuss the future of the …
Sorry, I was probably not explicit enough with "hope for the best"; I meant …
Yes, I agree about not turning this PR into a discussion about the …
I think the comment is good enough to be able to reconstruct the reasoning, thanks!
`KernelPCA` has a `copy_X` parameter, which makes one think that it may perform inplace operations. The operation that could be done inplace is the centering of the kernel using a `KernelCenterer` object. However, `KernelCenterer` makes a copy before applying the centering by default. So currently, the only effect of `copy_X` is to make an unnecessary copy. Since there's already a `copy_X` parameter, we can safely make the centering inplace in `fit`.

Remark

The centering could also be done inplace in `transform`, but given the semantics of `copy_X`, it may be confusing. Maybe leave it as is for now and wait for a global rework of `copy`/`copy_X`.

scikit-learn/sklearn/decomposition/_kernel_pca.py
Lines 149 to 152 in a63b021

I'm not sure this needs a test or even an entry in the changelog, tell me :)
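For context, the centering operation the PR makes inplace can be sketched in plain NumPy. This is a hedged sketch of kernel double centering (the quantity `KernelCenterer` computes), not scikit-learn's actual implementation; once `copy_X` has made its defensive copy, mutating `K` like this is safe:

```python
import numpy as np

# Double centering of a kernel matrix, done entirely in place on K:
#   K <- K - row_means - col_means + grand_mean
# Performed as two sequential in-place subtractions, which is equivalent.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K = X @ X.T

K -= K.mean(axis=0, keepdims=True)   # subtract column means (in place)
K -= K.mean(axis=1, keepdims=True)   # subtract updated row means (in place)

# After double centering, every row and column of K averages to zero.
print(np.allclose(K.mean(axis=0), 0))  # True
print(np.allclose(K.mean(axis=1), 0))  # True
```

The two sequential subtractions work because the second pass removes row means of the already column-centered matrix, which folds the grand-mean correction in automatically.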