Rescale regularization terms of NMF #20512

Merged: 22 commits, Jul 15, 2021
22 changes: 10 additions & 12 deletions doc/modules/decomposition.rst
@@ -825,25 +825,23 @@
In :class:`NMF`, L1 and L2 priors can be added to the loss function in order
to regularize the model. The L2 prior uses the Frobenius norm, while the L1
prior uses an elementwise L1 norm. As in :class:`ElasticNet`, we control the
combination of L1 and L2 with the :attr:`l1_ratio` (:math:`\rho`) parameter,
- and the intensity of the regularization with the :attr:`alpha`
- (:math:`\alpha`) parameter. Then the priors terms are:
+ and the intensity of the regularization with the :attr:`alpha_W` and :attr:`alpha_H`
+ (:math:`\alpha_W` and :math:`\alpha_H`) parameters. The priors are scaled by the number
+ of samples (:math:`n\_samples`) for `H` and the number of features (:math:`n\_features`)
+ for `W`, to keep their impact balanced with respect to one another and to keep the data
+ fit term as independent as possible of the size of the training set. The prior terms are then:

.. math::
-     \alpha \rho ||W||_1 + \alpha \rho ||H||_1
-     + \frac{\alpha(1-\rho)}{2} ||W||_{\mathrm{Fro}} ^ 2
-     + \frac{\alpha(1-\rho)}{2} ||H||_{\mathrm{Fro}} ^ 2
+     (\alpha_W \rho ||W||_1 + \frac{\alpha_W(1-\rho)}{2} ||W||_{\mathrm{Fro}} ^ 2) * n\_features
+     + (\alpha_H \rho ||H||_1 + \frac{\alpha_H(1-\rho)}{2} ||H||_{\mathrm{Fro}} ^ 2) * n\_samples

and the regularized objective function is:

.. math::
    d_{\mathrm{Fro}}(X, WH)
-     + \alpha \rho ||W||_1 + \alpha \rho ||H||_1
-     + \frac{\alpha(1-\rho)}{2} ||W||_{\mathrm{Fro}} ^ 2
-     + \frac{\alpha(1-\rho)}{2} ||H||_{\mathrm{Fro}} ^ 2
-
- :class:`NMF` regularizes both W and H by default. The :attr:`regularization`
- parameter allows for finer control, with which only W, only H,
- or both can be regularized.
+     + (\alpha_W \rho ||W||_1 + \frac{\alpha_W(1-\rho)}{2} ||W||_{\mathrm{Fro}} ^ 2) * n\_features
+     + (\alpha_H \rho ||H||_1 + \frac{\alpha_H(1-\rho)}{2} ||H||_{\mathrm{Fro}} ^ 2) * n\_samples
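
For context, here is a minimal sketch of how the per-factor penalties documented above are used from the estimator API. This is an illustration rather than part of the diff: it assumes scikit-learn >= 1.0 (where `alpha_W` and `alpha_H` exist), and the toy data and penalty values are arbitrary.

```python
import numpy as np
from sklearn.decomposition import NMF

# Small non-negative toy matrix: 6 samples, 5 features (illustrative only).
rng = np.random.RandomState(0)
X = np.abs(rng.standard_normal((6, 5)))

model = NMF(
    n_components=2,
    init="nndsvda",
    alpha_W=0.01,   # penalty on W, internally scaled by n_features
    alpha_H=0.01,   # penalty on H, internally scaled by n_samples
                    # (the default "same" reuses the value of alpha_W)
    l1_ratio=0.5,   # rho: mix between the L1 and L2 penalties
    max_iter=500,
    random_state=0,
)
W = model.fit_transform(X)
H = model.components_
```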

NMF with a beta-divergence
--------------------------
5 changes: 5 additions & 0 deletions doc/whats_new/v1.0.rst
@@ -259,6 +259,11 @@ Changelog
unused atoms during the dictionary update was not working as expected.
:pr:`19198` by :user:`Jérémie du Boisberranger <jeremiedbb>`.

+ - |API| The `alpha` and `regularization` parameters of :class:`decomposition.NMF` and
+   :func:`decomposition.non_negative_factorization` are deprecated and will be removed
+   in 1.2. Use the new parameters `alpha_W` and `alpha_H` instead. :pr:`20512` by
+   :user:`Jérémie du Boisberranger <jeremiedbb>`.
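
A rough migration sketch for the deprecation above (illustrative values only; note that the new penalties are additionally rescaled by the number of features for `W` and the number of samples for `H`, so results will not match the old parametrization exactly):

```python
from sklearn.decomposition import NMF

# Before this PR (deprecated in 1.0, removed in 1.2): one shared `alpha`
# plus a `regularization` switch, e.g.
# NMF(n_components=2, alpha=0.1, l1_ratio=0.5, regularization="both")

# Rough replacement using the new per-factor parameters.
model = NMF(n_components=2, alpha_W=0.1, alpha_H=0.1, l1_ratio=0.5)
```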

:mod:`sklearn.ensemble`
.......................
