[MRG+1] Add multiplicative-update solver in NMF, with all beta-divergence #5295
Conversation
I would favor using CD for beta divergences as well. We decided to remove the slow pg solver, and I would prefer not to add a new slow solver if possible. The KDD paper shows that for generalized KL we can just use the element-wise Newton update without line search. For other divergences this will need some verification (unless there are more recent works that cover this?), but in the worst case we can implement a simple line search. As a start, it would be nice to implement the CD update for generalized KL to compare.
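For illustration, a minimal NumPy sketch of the element-wise Newton update for generalized KL referenced above might look like the following. It is not the PR's code: it updates all entries of W at once for brevity, whereas a true coordinate-descent solver would update one coordinate at a time and keep WH in sync, and the function name and `eps` safeguard are purely illustrative.

```python
import numpy as np

def newton_update_W_kl(X, W, H, eps=1e-10):
    # Generalized KL: D(X || WH) = sum(X * log(X / WH) - X + WH)
    WH = W @ H + eps
    ratio = X / WH                     # X_ij / (WH)_ij
    grad = (1.0 - ratio) @ H.T         # dD/dW_ik   = sum_j H_kj * (1 - X_ij / (WH)_ij)
    hess = (ratio / WH) @ (H.T ** 2)   # d2D/dW_ik2 = sum_j H_kj^2 * X_ij / (WH)_ij^2
    W_new = W - grad / np.maximum(hess, eps)
    return np.maximum(W_new, eps)      # project back to non-negative values
```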
Could anybody comment on when this is planned to be merged into master? I'm quite keen on using NMF with generalized KL divergence.
@mblondel maybe we can do better than multiplicative updates, but I think we should be pragmatic here. We can deprecate it later when we have something better. wdyt?
Sorry for the long silence, I have not had as much time for this as before. I understand that multiplicative update is an old method and probably not the fastest, yet it is still a good way to improve NMF in scikit-learn, extending it to a lot of different loss functions.
I compared my implementation of multiplicative update with #2540 (Frobenius norm) and #1348 (KL divergence). Compared to #2540, the results are identical, provided some slight modifications.
Compared to #1348, the results are different since it omits a division in the update.
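For context, the kind of multiplicative update being compared here follows the standard beta-divergence rule. A minimal NumPy sketch (not the PR's code; the `eps` safeguard is illustrative) is:

```python
import numpy as np

def mu_update_W(X, W, H, beta, eps=1e-10):
    # beta = 2: Frobenius, beta = 1: generalized KL, beta = 0: Itakura-Saito.
    # Update rule: W <- W * [((WH)^(beta-2) * X) H^T] / [(WH)^(beta-1) H^T]
    WH = W @ H + eps
    numerator = ((WH ** (beta - 2.0)) * X) @ H.T
    denominator = (WH ** (beta - 1.0)) @ H.T + eps
    return W * numerator / denominator
```

Note that for beta outside [1, 2] an additional exponent on the update ratio is usually applied to guarantee a monotone decrease of the divergence; the sketch above omits it.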
@TomDLT @mblondel @agramfort Is this PR mergeable?
return timeset, err
import matplotlib.pyplot as plt
import pandas
Can we use pandas for the benchmarks?
Using it in a benchmark does not make it a dependency of the package, as with matplotlib.
@mblondel what's your take on this PR?
Thanks for the investigation.
Agreed.
+1 on my side and sorry for the late reply. As a big fan of CD, I would potentially be interested in the CD code for GKL divergence, as it could be faster in the sparse case.
cool. Looks like we can proceed then :)
let's make it a WIP->MRG then
It is, all reviews will be greatly appreciated.
However, NMF can also be used with a different function to measure the
distance between X and the matrix product WH. Another typical distance
function used in NMF is the (generalized) Kullback-Leibler (KL) divergence,
also referred to as I-divergence:
I would change the formulation a bit:
"Other distance functions can be used in NMF, for example the (generalized) Kullback-Leibler (KL) divergence, also referred to as I-divergence:
.. math::
    d_{KL}(X, Y) = \sum_{i,j} \left( X_{ij} \log\left(\frac{X_{ij}}{Y_{ij}}\right) - X_{ij} + Y_{ij} \right)
Or the Itakura-Saito (IS) divergence:"
See if you prefer.
Done
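For readers following the formulas above, here is a small NumPy sketch of both divergences; the function names and `eps` clipping are illustrative, not scikit-learn API:

```python
import numpy as np

def kl_divergence(X, Y, eps=1e-10):
    # Generalized KL (I-divergence): sum(X * log(X / Y) - X + Y)
    X, Y = X + eps, Y + eps
    return np.sum(X * np.log(X / Y) - X + Y)

def itakura_saito_divergence(X, Y, eps=1e-10):
    # Itakura-Saito: sum(X / Y - log(X / Y) - 1)
    ratio = (X + eps) / (Y + eps)
    return np.sum(ratio - np.log(ratio) - 1.0)
```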
Thanks. Hm, so @mblondel I saw a +1 from you up there, but that wasn't a full review, was it? @agramfort did you review? I see a +1 from @ogrisel.
No, I did not. No time :(
Hi, @TomDLT and @agramfort invited me to have a look at the code. I mostly looked at the math aspects of the code and it seems to be correct. After quick experiments on my data, everything seems to work as intended and gives similar results to other simple NMF Python codes I tested. As a side note, this one was always either equal to or the fastest in my tests, especially for higher dimensions.
ok, let's wait for #7927
Done
Thanks @TomDLT (and @VictorBst for the review). I squash-merged. 🍻
🍾 Cool! Thanks a lot for the reviews! 🍾
Yeah!!!
Awesome! Congrats!
This PR is the second part of what has been discussed in #4811 (first part, #4852, merged).
It includes:
Link to the rendered doc
plot_beta_divergence.py
