[MRG] Online implementation of non-negative matrix factorization #16948


Merged: 349 commits merged into scikit-learn:main on Apr 22, 2022

Conversation

@cmarmo (Contributor) commented Apr 17, 2020

Reference Issues/PRs

Continues #13386. Aims to fix #13308 and #13326.

What does this implement/fix? Explain your changes.

Implements online non-negative matrix factorization, following
"Online algorithms for nonnegative matrix factorization with the Itakura-Saito divergence", A. Lefèvre, F. Bach, C. Févotte, 2011.
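
For context, here is a minimal usage sketch of the estimator this PR ultimately added (MiniBatchNMF in sklearn.decomposition, shipped in scikit-learn 1.1); the data and parameter values are illustrative:

```python
import numpy as np
from sklearn.decomposition import MiniBatchNMF

rng = np.random.RandomState(0)
X = np.abs(rng.randn(200, 30))  # NMF requires non-negative input

# Factorize X ~= W @ H with mini-batch multiplicative updates.
nmf = MiniBatchNMF(n_components=8, batch_size=50, random_state=0)
W = nmf.fit_transform(X)  # (200, 8) per-sample activations
H = nmf.components_       # (8, 30) learned dictionary
```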

@cmarmo (Contributor, Author) commented Aug 30, 2020

@GaelVaroquaux, @jeremiedbb: this is still WIP, but I feel its status needs an update.
I have rerun the benchmarks for commit 0020eb6; the results are below.
[benchmark plot: bench_topics_drago]
With respect to the previous implementation:

  • I am no longer hardcoding the number of iterations for the minibatch algorithm: the convergence time is still better than for the classical NMF;
  • I have introduced the forgetting factor: a 'weight' to be applied to old batches (hardcoded for now).

I have also checked that the minibatch algorithm gives the same results as the single-batch one when the batch size is set to the number of samples (I have added a test for this).

From the plot, the loss is greater in the minibatch implementation, though in some cases it seems comparable. I am planning to investigate the role of the forgetting factor in the loss: from Févotte et al., it seems that this factor, and hence a good solution, depend on the number of samples and the batch size (the update scheme is sketched below).
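
For readers following along, here is a hedged numpy sketch of the accumulate-and-forget scheme described above, written for the Frobenius loss (the paper cited in the description works with the Itakura-Saito divergence, but the structure is the same). Names and defaults are illustrative, not the PR's actual internals:

```python
import numpy as np

def minibatch_nmf_frobenius(X, n_components, batch_size,
                            forget_factor=0.7, n_epochs=5,
                            inner_iter=10, seed=0):
    """Toy online NMF: X ~= W @ H, Frobenius loss, multiplicative updates."""
    rng = np.random.RandomState(seed)
    n_samples, n_features = X.shape
    H = np.abs(rng.randn(n_components, n_features))
    A = np.zeros((n_components, n_features))    # running numerator stats
    B = np.zeros((n_components, n_components))  # running denominator stats
    eps = 1e-10
    for _ in range(n_epochs):
        for start in range(0, n_samples, batch_size):
            Xb = X[start:start + batch_size]
            # 1) solve the batch activations W_b with H fixed
            Wb = np.abs(rng.randn(len(Xb), n_components))
            for _ in range(inner_iter):
                Wb *= (Xb @ H.T) / (Wb @ H @ H.T + eps)
            # 2) decay the old statistics ("forgetting") and add the batch
            A = forget_factor * A + Wb.T @ Xb
            B = forget_factor * B + Wb.T @ Wb
            # 3) update the dictionary H from the running statistics
            H *= A / (B @ H + eps)
    return H

H = minibatch_nmf_frobenius(np.abs(np.random.RandomState(42).randn(100, 20)),
                            n_components=5, batch_size=10)
```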

Here is what is still needed for this PR:

  • investigate the role of the forgetting factor in the loss estimation (and improve it; it may turn out that the expected loss should be investigated instead);
  • make the forgetting factor a parameter and suggest possible optimizations depending on the dimensions of the problem;
  • avoid code duplication (non_negative_factorization and non_negative_factorization_online are nearly identical right now);
  • improve tests (I am open to suggestions for testing those lines)
    [screenshot of the lines in question]
  • write documentation (!)

Thanks for listening, and let me know if you have any comments or suggestions.

@glemaitre self-requested a review on March 22, 2022.
@adrinjalali (Member) left a comment

I did not check the math, the rest looks pretty good.

```python
        return self

    def _solve_W(self, X, H, max_iter):
        """Minimize the objective function w.r.t W"""
```

It would make it easier for somebody like me to read and review if we explained what these methods do in the docstrings.

I tried to make it more explicit. Let me know what you think.
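
For illustration, the kind of expanded docstring under discussion might read like this (hypothetical wording, not necessarily the text that landed):

```python
# Hypothetical docstring sketch, not the PR's exact text:
def _solve_W(self, X, H, max_iter):
    """Minimize the objective function w.r.t. W, keeping H fixed.

    Runs multiplicative updates on W alone for at most `max_iter`
    iterations, or until the decrease in the objective stalls. This is
    what `transform` relies on once the dictionary H has been learned.
    """
```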

```python
        return W

    def partial_fit(self, X, y=None, W=None, H=None):
        """Update the model using the data in X as a mini-batch.
```

Could we explain how a user can apply partial_fit to a dataset to get the same result as running fit on it in its entirety? It would make it easier for people to decide how to use it.

I improved the docstring and linked to the incremental learning documentation. Maybe we could add a section there explaining in more detail how to use partial_fit, but I think that should be done in a separate PR since it concerns many estimators.
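
A minimal sketch of the pattern being asked about, assuming the merged MiniBatchNMF estimator: because fit makes several passes with its own stopping criterion, streaming the data through partial_fit gives a comparable, but generally not bit-for-bit identical, model.

```python
import numpy as np
from sklearn.decomposition import MiniBatchNMF

rng = np.random.RandomState(42)
X = np.abs(rng.randn(120, 20))

# One call on the full dataset.
full = MiniBatchNMF(n_components=5, batch_size=20, random_state=0).fit(X)

# Stream the same data chunk by chunk; each partial_fit call performs a
# single mini-batch update, so epochs are looped by hand.
streamed = MiniBatchNMF(n_components=5, batch_size=20, random_state=0)
for _ in range(10):                     # hand-rolled epochs
    for chunk in np.array_split(X, 6):  # six chunks of 20 samples each
        streamed.partial_fit(chunk)

# Both models reconstruct X comparably well.
print(np.linalg.norm(X - full.transform(X) @ full.components_))
print(np.linalg.norm(X - streamed.transform(X) @ streamed.components_))
```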

@glemaitre (Member) left a comment

Just posted the comments I had from before; you can discard any that may be outdated now.

```python
        self._check_params(X)

        if X.min() == 0 and self._beta_loss <= 0:
            raise ValueError(
```

Apparently, we don't check for this error in the tests either.

Added a test.
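
For reference, a hedged sketch of what such a test could look like (the test that actually landed may differ):

```python
import numpy as np
import pytest
from sklearn.decomposition import MiniBatchNMF

def test_minibatch_nmf_zero_data_nonpositive_beta_loss():
    # Exact zeros in X are invalid when beta_loss <= 0: the
    # Itakura-Saito divergence is not defined at zero entries.
    X = np.arange(12, dtype=np.float64).reshape(4, 3)  # X.min() == 0
    nmf = MiniBatchNMF(n_components=2, beta_loss='itakura-saito')
    with pytest.raises(ValueError):
        nmf.fit(X)
```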

@glemaitre (Member) left a comment

A review mainly of documentation nitpicks, just about formatting.


.. math::

    0.5 * ||X - WH||_{loss}^2

I don't know if `.` would be better than `*` for the multiplication.

We use `*` everywhere. Let's keep it consistent.

@glemaitre (Member) left a comment

LGTM once the doc nitpicks are included.

```python
batch_size = 3
max_iter = 1000

rng = np.random.mtrand.RandomState(42)
```

Just a question here: do we want to add support for the global random state fixture?

Given the mess it has caused so far, I'd rather do that very carefully in a follow-up PR :)
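
For the record, opting in later would look roughly like this, assuming scikit-learn's global_random_seed pytest fixture (the test body is illustrative):

```python
import numpy as np
from sklearn.decomposition import MiniBatchNMF

def test_minibatch_nmf_shapes(global_random_seed):
    # The fixture lets CI vary the seed across runs, so the assertion
    # must hold for any seed rather than a hardcoded 42.
    rng = np.random.RandomState(global_random_seed)
    X = np.abs(rng.randn(30, 10))
    nmf = MiniBatchNMF(n_components=3, batch_size=3,
                       random_state=global_random_seed).fit(X)
    assert nmf.components_.shape == (3, 10)
```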

@glemaitre (Member) left a comment

LGTM.

@glemaitre glemaitre merged commit 69132eb into scikit-learn:main Apr 22, 2022
@jeremiedbb (Member)

Thanks @cmarmo!

@glemaitre (Member)

Thanks everyone.

@jjerphan (Member)

Thanks to everyone involved.

@cmarmo deleted the modified_nmf_for_minibatch branch on April 22, 2022.
jjerphan pushed a commit to jjerphan/scikit-learn that referenced this pull request on Apr 29, 2022:
Online implementation of non-negative matrix factorization (scikit-learn#16948)

Co-authored-by: Tom Dupré la Tour <tom.dupre-la-tour@m4x.org>
Co-authored-by: jeremie du boisberranger <jeremiedbb@yahoo.fr>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

Online implementation of Non-negative Matrix Factorization (NMF)
10 participants