Commit 3adeabd

DOC better internal docstring for Cython enet_coordinate_descent (#31919)
1 parent ff0d6d1 commit 3adeabd

1 file changed: +44 -4 lines changed

sklearn/linear_model/_cd_fast.pyx

Lines changed: 44 additions & 4 deletions
@@ -110,12 +110,40 @@ def enet_coordinate_descent(
     bint random=0,
     bint positive=0
 ):
-    """Cython version of the coordinate descent algorithm
-    for Elastic-Net regression
+    """
+    Cython version of the coordinate descent algorithm for Elastic-Net regression.
 
-    We minimize
+    The algorithm mostly follows [Friedman 2010].
+    We minimize the primal
+
+        P(w) = 1/2 ||y - X w||_2^2 + alpha ||w||_1 + beta/2 ||w||_2^2
+
+    The dual for beta = 0, see e.g. [Fercoq 2015] with v = alpha * theta, is
+
+        D(v) = -1/2 ||v||_2^2 + y^T v
+
+    with dual feasible condition ||X^T v||_inf <= alpha.
+    For beta > 0, one uses extended versions of X and y by adding n_features rows
 
-        (1/2) * norm(y - X w, 2)^2 + alpha norm(w, 1) + (beta/2) norm(w, 2)^2
+        X -> (             X)  y -> (y)
+             (sqrt(beta) I)       (0)
+
+    Note that the residual y - X w is an important ingredient for the estimation of a
+    dual feasible point v.
+    At optimum of primal w* and dual v*, one has
+
+        v* = y - X w*
+
+    The duality gap is
+
+        G(w, v) = P(w) - D(v) >= P(w) - P(w*)
+
+    The final stopping criterion is based on the duality gap
+
+        G(w, v) <= tol ||y||_2^2
+
+    The tolerance here is multiplied by ||y||_2^2 to have an inequality that scales the
+    same on both sides and because one has G(0, 0) = 1/2 ||y||_2^2.
 
     Returns
     -------
@@ -127,6 +155,18 @@ def enet_coordinate_descent(
         Equals input `tol` times `np.dot(y, y)`. The tolerance used for the dual gap.
     n_iter : int
         Number of coordinate descent iterations.
+
+    References
+    ----------
+    .. [Friedman 2010]
+       Jerome H. Friedman, Trevor Hastie, Rob Tibshirani. (2010)
+       Regularization Paths for Generalized Linear Models via Coordinate Descent
+       https://www.jstatsoft.org/article/view/v033i01
+
+    .. [Fercoq 2015]
+       Olivier Fercoq, Alexandre Gramfort, Joseph Salmon. (2015)
+       Mind the duality gap: safer rules for the Lasso
+       https://arxiv.org/abs/1505.03410
     """
 
     if floating is float:
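
The duality-gap stopping rule described in the docstring can be illustrated with a small pure-Python sketch for the Lasso case (beta = 0). This is not the Cython implementation: the helper names (`lasso_primal`, `lasso_duality_gap`, `lasso_cd`) are hypothetical, plain lists stand in for arrays, and the gap is checked after every sweep as a simplification. The dual point is obtained by rescaling the residual so that ||X^T v||_inf <= alpha holds.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def soft_threshold(z, t):
    """Proximal operator of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_primal(X, y, w, alpha):
    """P(w) = 1/2 ||y - X w||_2^2 + alpha ||w||_1 (beta = 0)."""
    r = [yi - dot(row, w) for row, yi in zip(X, y)]
    return 0.5 * dot(r, r) + alpha * sum(abs(wj) for wj in w)


def lasso_duality_gap(X, y, w, alpha):
    """G(w, v) = P(w) - D(v) with v a rescaled residual (dual feasible)."""
    r = [yi - dot(row, w) for row, yi in zip(X, y)]
    cols = [[row[j] for row in X] for j in range(len(w))]
    xtr_inf = max(abs(dot(c, r)) for c in cols)
    # Shrink the residual only if it violates ||X^T v||_inf <= alpha.
    scale = 1.0 if xtr_inf <= alpha else alpha / xtr_inf
    v = [scale * ri for ri in r]
    dual = -0.5 * dot(v, v) + dot(y, v)
    return lasso_primal(X, y, w, alpha) - dual


def lasso_cd(X, y, alpha, tol=1e-10, max_iter=1000):
    """Cyclic coordinate descent; stops once G(w, v) <= tol * ||y||_2^2."""
    n_features = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_features)]
    norms = [dot(c, c) for c in cols]
    w = [0.0] * n_features
    r = list(y)  # residual y - X w for w = 0
    threshold = tol * dot(y, y)
    gap = lasso_duality_gap(X, y, w, alpha)
    n_iter = 0
    while gap > threshold and n_iter < max_iter:
        for j in range(n_features):
            if norms[j] == 0.0:
                continue
            w_old = w[j]
            z = dot(cols[j], r) + norms[j] * w_old
            w[j] = soft_threshold(z, alpha) / norms[j]
            delta = w[j] - w_old
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                r = [ri - delta * c for ri, c in zip(r, cols[j])]
        n_iter += 1
        gap = lasso_duality_gap(X, y, w, alpha)
    return w, gap, threshold, n_iter


X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 3.0]
w, gap, threshold, n_iter = lasso_cd(X, y, alpha=0.5)
```

Because the gap bounds the suboptimality P(w) - P(w*) from above, stopping on `gap <= threshold` certifies the accuracy of `w` without knowing the optimum.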

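
The extension of X and y for beta > 0 can be checked numerically: appending sqrt(beta) * I below X and n_features zeros below y turns the Elastic-Net primal into a Lasso primal on the extended data. The sketch below is a minimal pure-Python check under that reading of the docstring; the function names (`enet_primal`, `lasso_primal`, `augment`) and the example values are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def enet_primal(X, y, w, alpha, beta):
    """P(w) = 1/2 ||y - X w||_2^2 + alpha ||w||_1 + beta/2 ||w||_2^2."""
    r = [yi - dot(row, w) for row, yi in zip(X, y)]
    l1 = sum(abs(wj) for wj in w)
    return 0.5 * dot(r, r) + alpha * l1 + 0.5 * beta * dot(w, w)


def lasso_primal(X, y, w, alpha):
    """Same objective with beta = 0."""
    return enet_primal(X, y, w, alpha, 0.0)


def augment(X, y, beta):
    """Append sqrt(beta) * I to X and n_features zeros to y."""
    n_features = len(X[0])
    s = math.sqrt(beta)
    identity_rows = [
        [s if i == j else 0.0 for j in range(n_features)]
        for i in range(n_features)
    ]
    X_aug = [list(row) for row in X] + identity_rows
    y_aug = list(y) + [0.0] * n_features
    return X_aug, y_aug


# Arbitrary illustrative data, not taken from the commit.
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [1.0, 0.0, -1.0]
w = [0.5, -1.5]
alpha, beta = 0.3, 2.0

X_aug, y_aug = augment(X, y, beta)
p_enet = enet_primal(X, y, w, alpha, beta)
p_lasso_aug = lasso_primal(X_aug, y_aug, w, alpha)
```

The two values agree up to rounding because the extra rows contribute exactly beta ||w||_2^2 to the squared residual norm, so the Lasso dual and its feasibility condition can be reused on the extended data.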