PCA supports sparse now, docs suggest otherwise. #28406

@koaning

Describe the issue linked to the documentation

The latest release notes for 1.4 say the following about PCA.

Feature decomposition.PCA now supports scipy.sparse.sparray and scipy.sparse.spmatrix inputs when using the arpack solver. When used on sparse data like datasets.fetch_20newsgroups_vectorized this can lead to speed-ups of 100x (single threaded) and 70x lower memory usage. Based on Alexander Tarashansky’s implementation in scanpy. #18689 by Isaac Virshup and Andrey Portnoy.

However, once you go to the PCA docs it still says this.

Notice that this class does not support sparse input. See TruncatedSVD for an alternative with sparse data.

Suggest a potential alternative/fix

I guess that one sentence can just be removed now? I can whip up a PR if folks agree.
