
Rescale regularization terms of NMF #20512


Merged: 22 commits into scikit-learn:main on Jul 15, 2021

Conversation

@jeremiedbb (Member) commented Jul 12, 2021

Fixes #20484
Take over #5296

alpha_W and alpha_H replace alpha and regularization, which are deprecated. During the deprecation cycle, the new parameters take effect only if they are not left at their default values (in which case the old parameters are ignored), and the regularization terms are rescaled only when the new parameters are used.
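
For illustration, a minimal usage sketch of the new parameters (the data and penalty values here are made up for the example; alpha_H defaults to "same", meaning it takes the value of alpha_W):

    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.RandomState(0)
    X = rng.rand(100, 20)  # NMF requires non-negative data

    # Separate, rescaled penalties for W and H; the deprecated
    # `alpha` and `regularization` parameters are left untouched.
    model = NMF(n_components=5, init="nndsvda", alpha_W=0.01,
                alpha_H=0.1, l1_ratio=0.5, random_state=0)
    W = model.fit_transform(X)
    H = model.components_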

@ogrisel (Member) commented Jul 12, 2021

/cc @TomDLT 😉

@ogrisel (Member) left a comment

LGTM overall. Just a few comments below:

@glemaitre (Member) left a comment

LGTM but I think that there is more discussion to come on the issue.

Comment on lines +1421 to +1425
    (
        self._l1_reg_W,
        self._l1_reg_H,
        self._l2_reg_W,
        self._l2_reg_H,
(Member) left a comment

🙈

(Member) left a comment

I find it quite nice and readable personally.
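
For context, the tuple in the diff above is presumably unpacked from a helper that computes the scaled penalties. A sketch of what such a method could look like, assuming the attribute names from the diff and the scaling described in this PR (W penalties scaled by n_features, H penalties by n_samples):

    def _scale_regularization(self, X):
        # Scale the W penalty by n_features and the H penalty by
        # n_samples so that the two terms stay balanced regardless
        # of the shape of X.
        n_samples, n_features = X.shape
        alpha_H = self.alpha_W if self.alpha_H == "same" else self.alpha_H
        l1_reg_W = n_features * self.alpha_W * self.l1_ratio
        l1_reg_H = n_samples * alpha_H * self.l1_ratio
        l2_reg_W = n_features * self.alpha_W * (1.0 - self.l1_ratio)
        l2_reg_H = n_samples * alpha_H * (1.0 - self.l1_ratio)
        return l1_reg_W, l1_reg_H, l2_reg_W, l2_reg_H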

@ogrisel (Member) commented Jul 14, 2021

LGTM but I think that there is more discussion to come on the issue.

I am not sure what you mean by that @glemaitre.

@glemaitre (Member) commented

I was referring to #20484 (comment)

From what I understand, this PR means that we will have different ways of treating penalties across estimators.
I do not personally know the best way to handle the problem to reach consistency: should we consider these as bug fixes, or do we need to deprecate the current behavior?

@jeremiedbb (Member, Author) commented

We already treat penalties differently: Ridge does not scale alpha while Lasso does, for instance. The main issue with NMF is that we have two regularization terms (one for W, one for H) which are not balanced, and there is no way for the user to balance them. For most estimators there is only one regularization term, so even if it is not scaled properly, the user can at least change alpha.
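
Concretely, the rescaled objective should look roughly like the following (a sketch based on the description above, in plain-text notation; the data-fit term grows with n_samples * n_features, W has n_samples rows, and H has n_features columns, so scaling the W penalty by n_features and the H penalty by n_samples lets every term grow at the same rate):

    L(W, H) = 0.5 * ||X - W H||_F^2
              + n_features * alpha_W * l1_ratio * ||vec(W)||_1
              + n_samples  * alpha_H * l1_ratio * ||vec(H)||_1
              + 0.5 * n_features * alpha_W * (1 - l1_ratio) * ||W||_F^2
              + 0.5 * n_samples  * alpha_H * (1 - l1_ratio) * ||H||_F^2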

jeremiedbb and others added 4 commits July 15, 2021 12:31
Co-authored-by: Tom Dupré la Tour <tom.dupre-la-tour@m4x.org>
Co-authored-by: Tom Dupré la Tour <tom.dupre-la-tour@m4x.org>
@glemaitre glemaitre merged commit a42a333 into scikit-learn:main Jul 15, 2021
@glemaitre (Member) commented

OK. Merging then. Thanks @jeremiedbb

samronsin pushed a commit to samronsin/scikit-learn that referenced this pull request Nov 30, 2021
Co-authored-by: Tom Dupré la Tour <tom.dupre-la-tour@m4x.org>
Successfully merging this pull request may close these issues.

NMF regularization should not depend on n_samples