sklearn PCA with n_components = 'mle' and svd_solver = 'full' results in math domain error #10217
Comments
There are known issues with
I can reproduce this issue running your code snippet on the test_dataset.csv you provided on the
Sure, fine with me, @thechargedneutron. Thanks! It might be worth checking if #4827 addresses this and possibly continue that (stalled?) PR if the original author doesn't respond.
@rth What should be the ideal behaviour? How should equal spectrum be treated?
@thechargedneutron I have not read the reference paper, and I'm not certain what would be the ideal behavior in this case.
Closing as duplicate (part) of #4441
Description
sklearn PCA with n_components = 'mle' and svd_solver = 'full' results in math domain error
The problem is in this line of code. The result of `(spectrum[i] - spectrum[j])` is 0, and therefore I get `log(0)`, which causes this exception. Is this a sign of bad data, or should the implementation handle this case?
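
The failure mode described above can be illustrated in isolation. The following is only a minimal sketch, not the scikit-learn code: `log_gap` and `tied_pairs` are hypothetical helpers, the example spectra are made up, and the real expression in scikit-learn may contain additional factors. It assumes only that the logarithm is taken with Python's `math.log` over a term containing `spectrum[i] - spectrum[j]`.

```python
import math


def log_gap(spectrum, i, j):
    """Hypothetical helper: log of the gap between two spectrum values."""
    return math.log(spectrum[i] - spectrum[j])


def tied_pairs(spectrum):
    """Hypothetical diagnostic: index pairs whose values are exactly equal."""
    n = len(spectrum)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if spectrum[i] == spectrum[j]]


distinct = [4.0, 2.0, 1.0]  # made-up spectrum, all values different
tied = [4.0, 2.0, 2.0]      # made-up spectrum with two equal values

print(log_gap(distinct, 0, 1))  # finite, since 4.0 - 2.0 > 0

try:
    log_gap(tied, 1, 2)  # 2.0 - 2.0 == 0.0, so this is log(0)
except ValueError as exc:
    print("ValueError:", exc)

print(tied_pairs(tied))  # pairs that would trigger log(0)
```

A check like `tied_pairs` could be run on the singular values of a dataset to see whether exactly equal values are present before fitting.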
Steps/Code to Reproduce
Store this in a file, e.g. foo.csv
Expected Results
I expect the data to have dimensions (X, Y) with X <= 26 and Y <= 26.
Actual Results
Versions
Windows-10-10.0.14393-SP0
Python 3.6.2 (v3.6.2:5fd33b5, Jul 8 2017, 04:57:36) [MSC v.1900 64 bit (AMD64)]
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.19.0
Side References
I asked a question regarding this problem on Stack Overflow.