Fix covariance initialization when matrix is not invertible #277
Conversation
Thanks! Could you briefly describe the fix in the PR text, and confirm that it solves the broken piece of code in the issue?
metric_learn/_util.py
Outdated
raise LinAlgError("Unable to get a true inverse of the covariance "
                  "matrix since it is not definite. Try another "
                  "`{}`, or an algorithm that does not "
                  "require the `{}` to be strictly positive definite."
Maybe we could also add a third option for the user here: to reduce the input to avoid linear dependencies, as stated in the description of #276? Similarly to:
metric-learn/metric_learn/rca.py
Lines 99 to 104 in 1b40c3b
warnings.warn('The inner covariance matrix is not invertible, '
              'so the transformation matrix may contain Nan values. '
              'You should remove any linearly dependent features and/or '
              'reduce the dimensionality of your input, '
              'for instance using `sklearn.decomposition.PCA` as a '
              'preprocessing step.')
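The PCA preprocessing step this warning recommends can be sketched as follows. The data and dimensions below are illustrative, not from the library's tests: a duplicated linear combination makes the covariance matrix singular, and projecting onto the informative components restores invertibility.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.randn(100, 3)
# Append a linearly dependent column: col3 = col0 + col1.
X = np.hstack([X, X[:, :1] + X[:, 1:2]])

cov = np.cov(X, rowvar=False)
print(np.linalg.matrix_rank(cov))  # 3 < 4: the covariance is singular

# Keep only the components carrying non-negligible variance.
X_reduced = PCA(n_components=3).fit_transform(X)
cov_reduced = np.cov(X_reduced, rowvar=False)
print(np.linalg.matrix_rank(cov_reduced))  # 3: full rank, invertible
```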
Yes, I totally agree!
For the test, maybe you can extend/take inspiration from metric-learn/test/test_mahalanobis_mixin.py Line 572 in 1e42acb
metric_learn/_util.py
Outdated
.format(*((matrix_name,) * 2)))
M = np.dot(u / s, u.T)
if strict_pd:
  s, u = scipy.linalg.eigh(M_inv)
You can simply import eigh instead of the whole scipy library?
Yes, sure. I did it that way just because it was done like that previously.
metric_learn/_util.py
Outdated
if strict_pd:
  s, u = scipy.linalg.eigh(M_inv)
  cov_is_definite = _check_sdp_from_eigen(s)
  if not cov_is_definite:
Add a test case to exercise the issue you are trying to fix.
Of course! I will build a test based on the suggestion of @wdevazelhes
Looking at
Good catch. We should return the pseudo-inverse there as well when
Since we are not going to need the eigenvalues and eigenvectors to know positive definiteness, would it be a good idea to replace it with the Cholesky factorization, which is a more efficient way to do so?
Note that since we always compute
You are right! I can compute the SVD, and with that I can check the eigenvalues and also calculate the pseudo-inverse. I will implement it that way then.
This last commit broke two tests, but I suspect they are not well set as they ask for a

Edit: For some reason, there seem to be many more errors than what I got on my local repo.

Why switch from
What do you mean? MMC does require PSD matrices (like any Mahalanobis metric learning algorithm). Maybe you meant it does not require strictly PD?
Is it possible to get the pseudo-inverse from the eigenvalue decomposition? I looked at the implementation of
My bad! That is what I meant. But I was wrong, as the error is raised when there are negative eigenvalues. I suspect the problem is in the tolerance of

Any idea about the other failed test? It seems like an outdated version of numpy is being used.

SVD loses the sign information from the eigenvalues, so PSD cannot be checked with them; this clearly explains the failed tests. I think I will have to roll back the last commit then. Sorry for the confusion!
Yes, since the SVD is the same as the eigenvalue decomposition for a symmetric matrix; see for instance
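The sign issue discussed here can be shown in a couple of lines. This is an illustrative example, not code from the PR: for a symmetric matrix the singular values are the absolute values of the eigenvalues, so a negative eigenvalue is invisible to the SVD and PSD-ness cannot be checked from singular values alone.

```python
import numpy as np
from scipy.linalg import eigh, svd

# An indefinite symmetric matrix with one negative eigenvalue.
A = np.diag([2., -1.])
w = eigh(A, eigvals_only=True)        # eigenvalues keep their sign: [-1., 2.]
s = svd(A, compute_uv=False)          # singular values are |eigenvalues|: [2., 1.]
print(w, s)
```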
Yes, thank you! I was just looking at that.
For computing the pseudo-inverse, you should simply follow what is done in
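For reference, here is a hedged sketch of how a pseudo-inverse can be assembled from the eigendecomposition that is already computed for the definiteness check. The helper name `pseudo_inverse_from_eigh` and its tolerance rule are illustrative, not the PR's actual `_pseudo_inverse_from_eig` implementation.

```python
import numpy as np
from scipy.linalg import eigh, pinvh

def pseudo_inverse_from_eigh(s, u, tol=None):
    """Pseudo-inverse of a symmetric matrix from its eigenpairs (s, u)."""
    if tol is None:
        # Cutoff similar in spirit to scipy's pinvh default.
        tol = np.abs(s).max() * max(u.shape) * np.finfo(s.dtype).eps
    # Invert only the eigenvalues that are numerically non-zero.
    s_inv = np.where(np.abs(s) > tol, 1. / np.where(s == 0, 1., s), 0.)
    return np.dot(u * s_inv, u.T)

# Singular PSD matrix: rank 2 in dimension 3.
L = np.array([[1., 0., 1.],
              [0., 1., 1.]])
M = L.T.dot(L)
s, u = eigh(M)
assert np.allclose(pseudo_inverse_from_eigh(s, u), pinvh(M))
```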
metric_learn/_util.py
Outdated
if strict_pd and not init_is_definite:
  raise LinAlgError("You should provide a strictly positive definite "
                    "matrix as `{}`. This one is not definite. Try another"
                    " {}, or an algorithm that does not "
                    "require the {} to be strictly positive definite."
                    .format(*((matrix_name,) * 3)))
elif return_inverse and not init_is_definite:
  warnings.warn('The initialization matrix is not invertible, '
                'but this isn"t an issue as the pseudo-inverse is used.')
Instead of the second part, maybe simply "using the pseudo-inverse instead".
Thanks! I was looking for a better way to phrase that.
Things look good: you mainly need to write tests and we're good to go. You should definitely write at least one test for your _pseudo_inverse_from_eig function, and tests for the non-invertible cases.
Also it would be nice to check whether we have a test for when the initialization matrix is not symmetric. According to https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eigh.html:
"If eigenvalue computation does not converge, an error occurred, or b matrix is not definite positive. Note that if input matrices are not symmetric or hermitian, no error is reported but results will be wrong."
So it is not clear what happens for us and whether we correctly deal with this.
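The scipy caveat quoted above is easy to reproduce. In this illustrative example, `scipy.linalg.eigh` reads only one triangle of the input (the lower one by default), so for a non-symmetric matrix it silently returns the decomposition of a different, symmetrized matrix instead of raising an error.

```python
import numpy as np
from scipy.linalg import eigh

# A non-symmetric matrix whose true eigenvalues are +2 and -2.
A = np.array([[0., 4.],
              [1., 0.]])

w = eigh(A, eigvals_only=True)   # no error: only the lower triangle is read,
                                 # i.e. this decomposes [[0., 1.], [1., 0.]]
w_true = np.linalg.eigvals(A)    # the actual eigenvalues of A
print(np.sort(w), np.sort(w_true.real))  # [-1. 1.] vs [-2. 2.]
```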
s, u = scipy.linalg.eigh(init)
init_is_definite = _check_sdp_from_eigen(s)
if isinstance(M, np.ndarray):
  w, V = eigh(M, check_finite=False)
Why did you turn off check_finite? I think it is better to keep it.
check_array() does it previously, so it shouldn't be necessary to do it again.
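For context on this exchange, a small illustrative sketch of the scikit-learn behavior being referenced: with its defaults, `check_array` rejects non-finite input, which is why repeating the finiteness check inside `eigh` would be redundant.

```python
import numpy as np
from sklearn.utils import check_array

X_bad = np.array([[1., 2.],
                  [np.nan, 3.]])
try:
    check_array(X_bad)      # default settings reject NaN/inf
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```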
Indeed!
metric_learn/_util.py
Outdated
if strict_pd and not init_is_definite:
  raise LinAlgError("You should provide a strictly positive definite "
                    "matrix as `{}`. This one is not definite. Try another"
                    " {}, or an algorithm that does not "
                    "require the {} to be strictly positive definite."
                    .format(*((matrix_name,) * 3)))
elif return_inverse and not init_is_definite:
  warnings.warn('The initialization matrix is not invertible, '
                'but this isn"t an issue as the pseudo-inverse is used.')
maybe simply "The initialization matrix is not invertible: using the pseudo-inverse instead"
metric_learn/_util.py
Outdated
M = np.dot(u / s, u.T)
elif not cov_is_definite:
  warnings.warn('The inverse covariance matrix is not invertible, '
                'but this isn"t an issue as the pseudo-inverse is used. '
Same here.
metric_learn/_util.py
Outdated
elif not cov_is_definite:
  warnings.warn('The inverse covariance matrix is not invertible, '
                'but this isn"t an issue as the pseudo-inverse is used. '
                'You can remove any linearly dependent features and/or '
For clarity, maybe start this sentence with "To make the inverse covariance matrix invertible, "
Great! I am making the tests now, should be done soon. I don't think there should be an issue with non-symmetric matrices, as this is checked for the array init case and the covariance matrix should always be symmetric. I think there is a missing test for the symmetry check of an array init that should be implemented.
Indeed, I had missed the symmetry check for
(but it seems that we are not: https://codecov.io/gh/scikit-learn-contrib/metric-learn/src/master/metric_learn/_util.py#L671)
Great! With the other tests this should be ready then.
metric_learn/_util.py
Outdated
warnings.warn('The inverse covariance matrix is not invertible, '
              'but this isn"t an issue as the pseudo-inverse is used. '
              'You can remove any linearly dependent features and/or '
warnings.warn('The initialization matrix is not invertible: '
initialization matrix --> covariance matrix
metric_learn/_util.py
Outdated
'You can remove any linearly dependent features and/or '
warnings.warn('The initialization matrix is not invertible: '
              'using the pseudo-inverse instead.'
              'To make the inverse covariance matrix invertible'
the inverse covariance --> the covariance
In addition to the above, it would be good to write a test for _pseudo_inverse_from_eig to verify that it returns the same as when calling pinvh on the original matrix.
"""Tests that when using the 'covariance' init or prior, it returns the | ||
appropriate warning if the covariance matrix is singular, for algorithms | ||
that don't need a strictly PD init. Also checks that the returned | ||
inverse matrix has finite values |
This is not what you check, right? You seem to check the learned M. Any reason you do this and not a call to _initialize_metric_mahalanobis like in the other test? Maybe you could check both things in both tests?
Yes, you are totally right! I forgot to set the max iterations in order to make it stop before the first iteration. I called _initialize_metric_mahalanobis directly in the other test because currently there is no algorithm that could reproduce that setting. But setting max_iter = 0 should keep it from iterating and would result in the M from the initialization.

Edit: It isn't possible to set max_iter = 0, so instead I will do it like you suggested.
X, y = shuffle(*make_blobs(random_state=rng),
               random_state=rng)
P = ortho_group.rvs(X.shape[1], random_state=rng)
w = np.abs(rng.randn(X.shape[1]))
w[0] = w0
M = P.dot(np.diag(w)).dot(P.T)
This seems a bit convoluted just to generate a singular matrix? You could just draw a rectangular matrix L and construct M as L.T.dot(L)?
I agree with you, I just did it that way because it was done like that in test_singular_array_init_or_prior_strictpd. But also this way you have only one zero eigenvalue and can control its value directly with the parametrization.
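The two constructions compared in this exchange can be sketched side by side; the dimensions and seed below are illustrative. Both produce a singular PSD matrix, but the orthogonal-diagonal parametrization gives exactly one zero eigenvalue whose value is set directly.

```python
import numpy as np
from scipy.stats import ortho_group

rng = np.random.RandomState(42)
d = 5

# Reviewer's suggestion: rank-deficient by construction.
L = rng.randn(d - 1, d)          # rectangular, so L.T @ L has rank <= d - 1
M1 = L.T.dot(L)

# Test's construction: exactly one controlled zero eigenvalue.
P = ortho_group.rvs(d, random_state=rng)
w = np.abs(rng.randn(d))
w[0] = 0.                        # the eigenvalue set by the parametrization
M2 = P.dot(np.diag(w)).dot(P.T)

print(np.linalg.matrix_rank(M1), np.linalg.matrix_rank(M2))  # 4 4
```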
Sounds good
Apart from the random seed thing, LGTM!
test/test_utils.py
Outdated
def test_pseudo_inverse_from_eig_and_pinvh_singular(w0):
  """Checks that _pseudo_inverse_from_eig return the same result as
  scipy.linalg.pinvh for a singular matrix"""
  A = np.random.rand(100, 100)
You should fix the random seed
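A hedged sketch of the seeding being requested, combined with the pinvh comparison this test is after. The seed value and the inline eigendecomposition inverse are illustrative stand-ins, not the PR's `_pseudo_inverse_from_eig` helper.

```python
import numpy as np
from scipy.linalg import eigh, pinvh

SEED = 42  # any fixed value makes the randomized test reproducible
rng = np.random.RandomState(SEED)

A = rng.rand(100, 100)
M = A + A.T                     # symmetric, almost surely non-singular

w, V = eigh(M)
M_pinv = np.dot(V / w, V.T)     # inverse assembled from the eigenpairs
assert np.allclose(M_pinv, pinvh(M))
```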
test/test_utils.py
Outdated
def test_pseudo_inverse_from_eig_and_pinvh_nonsingular():
  """Checks that _pseudo_inverse_from_eig return the same result as
  scipy.linalg.pinvh for a non singular matrix"""
  A = np.random.rand(100, 100)
Same here
LGTM, nice work @grudloff !
Thanks!
Nitpick: calling
Thanks @grudloff! Merging
This PR fixes #276, an issue that arises when using covariance initialization with an algorithm that doesn't require a PD matrix. The computed covariance matrix can be singular, and hence not invertible, which previously led to uninformative warnings. This is fixed by using the pseudo-inverse whenever a PD matrix is not required.