Merged
7 changes: 3 additions & 4 deletions doc/whats_new.rst
@@ -29,17 +29,16 @@ Bug fixes
.........

- :class:`RandomizedPCA` default number of `iterated_power` is 2 instead of 3.
This is a speed up with a minor precision decrease. By `Giorgio Patrini`_.
This is a speed up with a minor precision decrease. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

- :func:`randomized_svd` performs 2 power iterations by default, instead of 0.
In practice this is often enough for obtaining a good approximation of the
true eigenvalues/vectors in the presence of noise. By `Giorgio Patrini`_.
true eigenvalues/vectors in the presence of noise. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

- :func:`randomized_range_finder` is more numerically stable when many
power iterations are requested, since it applies LU normalization by default.
If `n_iter<=2` numerical issues are unlikely, thus no normalization is applied.
Other normalization options are available: 'none', 'LU' and 'QR'. By
`Giorgio Patrini`_.
Other normalization options are available: 'none', 'LU' and 'QR'. (`#5141 <https://github.com/scikit-learn/scikit-learn/pull/5141>`_) by `Giorgio Patrini`_.

- Fixed bug in :func:`manifold.spectral_embedding` where diagonal of unnormalized
Laplacian matrix was incorrectly set to 1. By `Peter Fischer`_.
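The changelog entry on power iterations can be illustrated with a small NumPy-only sketch. It is not the scikit-learn implementation: `sketch_singular_values`, the matrix sizes, the noise level and the seeds are all assumptions made for this example, and every step is QR-normalized for simplicity. It compares the top singular values estimated with 0 and with 2 power iterations on a noisy low-rank matrix.

```python
import numpy as np


def sketch_singular_values(A, k, n_iter, n_oversamples=10, seed=0):
    """Estimate the top-k singular values of A with a randomized sketch.

    Hypothetical helper for illustration; not the scikit-learn code.
    """
    rng = np.random.default_rng(seed)
    # Random test matrix with k + n_oversamples columns.
    Q = rng.standard_normal((A.shape[1], k + n_oversamples))
    Q, _ = np.linalg.qr(A @ Q)
    # Each power iteration pulls the basis towards the dominant subspace.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # Project A onto the basis; the small matrix carries the estimates.
    return np.linalg.svd(Q.T @ A, compute_uv=False)[:k]


# Assumed test problem: rank-5 signal plus dense Gaussian noise.
rng = np.random.default_rng(42)
signal = rng.standard_normal((300, 5)) @ rng.standard_normal((5, 200))
noisy = signal + 0.5 * rng.standard_normal((300, 200))

exact = np.linalg.svd(noisy, compute_uv=False)[:5]
err_0 = np.abs(sketch_singular_values(noisy, 5, n_iter=0) - exact).max()
err_2 = np.abs(sketch_singular_values(noisy, 5, n_iter=2) - exact).max()
print(f"max error with 0 power iterations: {err_0:.3e}")
print(f"max error with 2 power iterations: {err_2:.3e}")
```

On this kind of input the estimate with 2 power iterations should be noticeably closer to the exact values, at the cost of four extra matrix products.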
8 changes: 5 additions & 3 deletions sklearn/decomposition/pca.py
@@ -488,7 +488,9 @@ class RandomizedPCA(BaseEstimator, TransformerMixin):
use fit_transform(X) instead.

iterated_power : int, optional
Number of iterations for the power method. 3 by default.
Number of iterations for the power method. 2 by default.

.. versionchanged:: 0.18
Contributor

I think what @amueller @ogrisel suggested was to add before the parameter. You could check.

Member

no this is good. I'm not sure if it needs a newline above it to render correctly. can you check please?

Contributor Author

That's indeed the case. Fixed now.

[Screenshot attached: 2015-10-21, 6:33:19 pm]

Member

thanks for checking. It doesn't look to be fixed here, though. maybe you didn't push?

Contributor Author

Now it is :)


whiten : bool, optional
When True (False by default) the `components_` vectors are divided
@@ -510,8 +512,8 @@ class RandomizedPCA(BaseEstimator, TransformerMixin):
Components with maximum variance.

explained_variance_ratio_ : array, [n_components]
Percentage of variance explained by each of the selected components. \
k is not set then all components are stored and the sum of explained \
Percentage of variance explained by each of the selected components.
If k is not set then all components are stored and the sum of explained
variances is equal to 1.0

mean_ : array, [n_features]
14 changes: 13 additions & 1 deletion sklearn/utils/extmath.py
@@ -195,18 +195,24 @@ def randomized_range_finder(A, size, n_iter=2,
Parameters
----------
A: 2D array
The input data matrix
The input data matrix.

size: integer
Size of the return array

n_iter: integer
Number of power iterations used to stabilize the result

power_iteration_normalizer: 'auto' (default), 'QR', 'LU', 'none'
Whether the power iterations are normalized with step-by-step
QR factorization (the slowest but most accurate), 'none'
(the fastest but numerically unstable when `n_iter` is large, e.g.
typically 5 or larger), or 'LU' factorization (numerically stable
but can lose slightly in accuracy). The 'auto' mode applies no
normalization if `n_iter`<=2 and switches to LU otherwise.

.. versionadded:: 0.18

random_state: RandomState or an int seed (0 by default)
A random number generator instance
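The trade-off described for `power_iteration_normalizer` can be sketched with NumPy alone. This is an illustration under stated assumptions, not the library code: `range_basis` and `top_k_error` are hypothetical helpers, only the 'none' and 'QR' modes are covered (an 'LU' variant would need an LU routine such as `scipy.linalg.lu`), and the test matrix with singular values 2^-i is chosen so that many unnormalized iterations push the smaller directions below floating-point resolution.

```python
import numpy as np


def range_basis(A, size, n_iter, normalizer, seed=0):
    """Orthonormal basis for the approximate range of A.

    Hypothetical helper: supports only 'none' and 'QR'.
    """
    rng = np.random.default_rng(seed)
    Q = rng.standard_normal((A.shape[1], size))
    for _ in range(n_iter):
        if normalizer == "QR":
            # Re-orthonormalize after every product (slow but accurate).
            Q, _ = np.linalg.qr(A @ Q)
            Q, _ = np.linalg.qr(A.T @ Q)
        elif normalizer == "none":
            # Plain power iteration: columns collapse onto the top direction.
            Q = A.T @ (A @ Q)
        else:
            raise ValueError(f"unsupported normalizer: {normalizer!r}")
    Q, _ = np.linalg.qr(A @ Q)
    return Q


# Assumed test matrix with known singular values 1, 1/2, 1/4, ...
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((60, 40)))
V, _ = np.linalg.qr(rng.standard_normal((40, 40)))
s = 2.0 ** -np.arange(40)
A = (U * s) @ V.T


def top_k_error(normalizer, k=5, n_iter=20):
    """Largest absolute error on the top-k singular value estimates."""
    Q = range_basis(A, size=10, n_iter=n_iter, normalizer=normalizer)
    approx = np.linalg.svd(Q.T @ A, compute_uv=False)[:k]
    return np.abs(approx - s[:k]).max()


err_none = top_k_error("none")
err_qr = top_k_error("QR")
print(f"'none' normalizer, 20 iterations: {err_none:.3e}")
print(f"'QR' normalizer, 20 iterations:   {err_qr:.3e}")
```

With 20 iterations the QR-normalized basis should recover the top singular values almost to machine precision, while the unnormalized one is expected to lose the smaller ones.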

@@ -283,6 +289,8 @@ def randomized_svd(M, n_components, n_oversamples=10, n_iter=2,
Number of power iterations (can be used to deal with very noisy
problems).

.. versionchanged:: 0.18

power_iteration_normalizer: 'auto' (default), 'QR', 'LU', 'none'
Whether the power iterations are normalized with step-by-step
QR factorization (the slowest but most accurate), 'none'
@@ -291,13 +299,17 @@ def randomized_svd(M, n_components, n_oversamples=10, n_iter=2,
but can lose slightly in accuracy). The 'auto' mode applies no
normalization if `n_iter`<=2 and switches to LU otherwise.

.. versionadded:: 0.18

transpose: True, False or 'auto' (default)
Whether the algorithm should be applied to M.T instead of M. The
result should approximately be the same. The 'auto' mode will
trigger the transposition if M.shape[1] > M.shape[0] since this
implementation of randomized SVD tends to be a little faster in that
case.

.. versionchanged:: 0.18

flip_sign: boolean, (True by default)
The output of a singular value decomposition is only unique up to a
permutation of the signs of the singular vectors. If `flip_sign` is
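The two 'auto' rules quoted in the `randomized_svd` docstring can be written out as a tiny pure-Python sketch. `resolve_auto_options` is a hypothetical helper invented for this example; it only restates the documented rules (no normalization when `n_iter` <= 2, LU otherwise; transpose when `M.shape[1] > M.shape[0]`) and is not the library's internal code.

```python
def resolve_auto_options(shape, n_iter,
                         power_iteration_normalizer="auto",
                         transpose="auto"):
    """Resolve 'auto' settings following the rules stated in the docstring.

    Hypothetical helper for illustration only.
    """
    if power_iteration_normalizer == "auto":
        # Few iterations: numerical issues are unlikely, skip normalization.
        power_iteration_normalizer = "none" if n_iter <= 2 else "LU"
    if transpose == "auto":
        n_rows, n_cols = shape
        # Work on M.T when M has more columns than rows.
        transpose = n_cols > n_rows
    return power_iteration_normalizer, transpose


print(resolve_auto_options((100, 20), n_iter=2))  # ('none', False)
print(resolve_auto_options((20, 100), n_iter=7))  # ('LU', True)
```

Explicit values pass through unchanged, so a caller can still force 'QR' or disable the transposition.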