Commit d60b51b

removed from narrative documentation
1 parent 32afdc3 commit d60b51b

File tree

3 files changed (+0 -216 lines)

doc/modules/feature_selection.rst (-61 lines)
@@ -227,67 +227,6 @@ alpha parameter, the fewer features selected.
 Processing Magazine [120] July 2007
 http://dsp.rice.edu/sites/dsp.rice.edu/files/cs/baraniukCSlecture07.pdf

-.. _randomized_l1:
-
-Randomized sparse models
-------------------------
-
-.. currentmodule:: sklearn.linear_model
-
-In terms of feature selection, there are some well-known limitations of
-L1-penalized models for regression and classification. For example, it is
-known that the Lasso will tend to select an individual variable out of a group
-of highly correlated features. Furthermore, even when the correlation between
-features is not too high, the conditions under which L1-penalized methods
-consistently select "good" features can be restrictive in general.
-
-To mitigate this problem, it is possible to use randomization techniques such
-as those presented in [B2009]_ and [M2010]_. The latter technique, known as
-stability selection, is implemented in the module :mod:`sklearn.linear_model`.
-In the stability selection method, an L1-penalized model is fit to a
-subsample of the data, with the penalty of a random subset of coefficients
-scaled down. Specifically, given a subsample of the data
-:math:`(x_i, y_i), i \in I`, where :math:`I \subset \{1, 2, \ldots, n\}` is a
-random subset of the data of size :math:`n_I`, the following modified Lasso
-fit is obtained:
-
-.. math:: \hat{w_I} = \mathrm{arg}\min_{w} \frac{1}{2 n_I} \sum_{i \in I} (y_i - x_i^T w)^2 + \alpha \sum_{j=1}^p \frac{\vert w_j \vert}{s_j},
-
-where :math:`s_j \in \{s, 1\}` are independent trials of a fair Bernoulli
-random variable, and :math:`0 < s < 1` is the scaling factor. By repeating this
-procedure across different random subsamples and Bernoulli trials, one can
-count the fraction of times the randomized procedure selected each feature,
-and use these fractions as scores for feature selection.
-
-:class:`RandomizedLasso` implements this strategy for regression
-settings, using the Lasso, while :class:`RandomizedLogisticRegression` uses
-logistic regression and is suitable for classification tasks. To get a full
-path of stability scores you can use :func:`lasso_stability_path`.
-
-.. figure:: ../auto_examples/linear_model/images/sphx_glr_plot_sparse_recovery_003.png
-   :target: ../auto_examples/linear_model/plot_sparse_recovery.html
-   :align: center
-   :scale: 60
-
-Note that for randomized sparse models to be more powerful than standard
-F statistics at detecting non-zero features, the ground-truth model
-should be sparse; in other words, only a small fraction of the features
-should be non-zero.
-
-.. topic:: Examples:
-
-    * :ref:`sphx_glr_auto_examples_linear_model_plot_sparse_recovery.py`: An example
-      comparing different feature selection approaches and discussing in
-      which situation each approach is to be favored.
-
-.. topic:: References:
-
-    .. [B2009] F. Bach, "Model-Consistent Sparse Estimation through the
-       Bootstrap." https://hal.inria.fr/hal-00354771/
-
-    .. [M2010] N. Meinshausen, P. Bühlmann, "Stability selection",
-       Journal of the Royal Statistical Society, Series B, 72 (2010).
-       http://arxiv.org/pdf/0809.2932.pdf

 Tree-based feature selection
 ----------------------------
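
The removed text fully specifies the resampling procedure, so it can still be reproduced with scikit-learn's public :class:`Lasso` estimator. Below is a minimal sketch, assuming only current scikit-learn and NumPy APIs; the helper name ``stability_scores`` and its default parameters are illustrative, not part of any released API. It uses the fact that dividing the penalty on :math:`w_j` by :math:`s_j` is equivalent to multiplying column :math:`j` of the data by :math:`s_j` before an ordinary Lasso fit, which leaves the selected support unchanged.

    import numpy as np
    from sklearn.linear_model import Lasso

    def stability_scores(X, y, alpha=0.01, scaling=0.5,
                         sample_fraction=0.75, n_resampling=200,
                         random_state=0):
        # Fraction of randomized Lasso fits in which each feature's
        # coefficient is non-zero: the score described in the removed text.
        rng = np.random.RandomState(random_state)
        n_samples, n_features = X.shape
        n_I = int(sample_fraction * n_samples)
        counts = np.zeros(n_features)
        for _ in range(n_resampling):
            # Random subsample I of the rows, drawn without replacement.
            rows = rng.choice(n_samples, size=n_I, replace=False)
            # Fair Bernoulli trials: each s_j is either `scaling` or 1.
            s = np.where(rng.rand(n_features) < 0.5, scaling, 1.0)
            # Rescaling column j by s_j reproduces the per-coefficient
            # penalty |w_j| / s_j of the modified Lasso fit above.
            coef = Lasso(alpha=alpha).fit(X[rows] * s, y[rows]).coef_
            counts += (coef != 0)
        return counts / n_resampling

Features with scores near 1 are selected in almost every randomized fit; thresholding these scores then yields a feature selector in the spirit of the removed :class:`RandomizedLasso`.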

doc/modules/linear_model.rst (-5 lines)
@@ -205,11 +205,6 @@ computes the coefficients along the full path of possible values.
 thus be used to perform feature selection, as detailed in
 :ref:`l1_feature_selection`.

-.. note:: **Randomized sparsity**
-
-   For feature selection or sparse recovery, it may be interesting to
-   use :ref:`randomized_l1`.
-

 Setting regularization parameter
 --------------------------------

examples/linear_model/plot_sparse_recovery.py (-150 lines)

This file was deleted.
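
The deleted example visualized stability scores along a path of regularization values. A rough stand-in for the removed :func:`lasso_stability_path`, assuming the ``stability_scores`` helper sketched above (not the deleted example's actual code), is:

    import numpy as np

    def stability_path(X, y, alphas, **kwargs):
        # One row of selection frequencies per regularization value.
        return np.array([stability_scores(X, y, alpha=a, **kwargs)
                         for a in alphas])

    # Plotting each column of the result against `alphas` gives the kind
    # of stability-path figure the deleted example generated.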

0 commit comments
