PCA implementation does not match Tipping and Bishop #10137

@rdturnermtl

The documentation for sklearn.decomposition.PCA states:
"Implements the probabilistic PCA model from: M. Tipping and C. Bishop, Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society, Series B, 61, Part 3, pp. 611-622 via the score and score_samples methods."

It is unclear what the other methods are intended to implement, but to match the maximum-likelihood estimate (MLE) for probabilistic PCA, some changes need to be made. The assignment

explained_variance_ = (S ** 2) / (n_samples - 1)

on line 423 needs to be

explained_variance_ = (S ** 2) / n_samples

The former uses the unbiased estimate of the covariance, whereas the latter is the 1/n (maximum-likelihood) form used in Tipping and Bishop eq. (11).
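
To make the difference concrete, here is a minimal NumPy sketch, not taken from the scikit-learn source. It assumes S holds the singular values of the centered data matrix; the random data and the variable names (var_unbiased, var_mle) are made up for illustration.

```python
import numpy as np

# Illustrative data only; any (n_samples, n_features) array would do.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
n_samples = X.shape[0]

# Center the data and take the singular values of the centered matrix.
Xc = X - X.mean(axis=0)
S = np.linalg.svd(Xc, compute_uv=False)

# Normalization quoted in the issue (unbiased covariance estimate).
var_unbiased = S ** 2 / (n_samples - 1)

# Normalization proposed in the issue (1/n, maximum-likelihood covariance).
var_mle = S ** 2 / n_samples

# Cross-check against eigenvalues of the two covariance estimates.
eig_unbiased = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]
eig_mle = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False, bias=True)))[::-1]

print(var_unbiased / var_mle)  # constant ratio n_samples / (n_samples - 1)
```

The two estimates differ only by the factor n_samples / (n_samples - 1), so the gap matters mostly for small samples.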

Also, the PPCA projection in Sec. 3.3 of Tipping and Bishop differs from the conventional projection used in the implementation here. The documentation should clarify this. Alternatively, a separate class (PPCA) could be added that implements Tipping and Bishop more cleanly.
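
As a rough illustration of how the two projections differ, the sketch below compares the conventional projection onto the leading eigenvectors with the posterior-mean projection M^{-1} W^T (t - mu), where M = W^T W + sigma^2 I, as I read Sec. 3.3 of the paper. It is an assumption-laden sketch, not scikit-learn code: the data, the choice q = 2, the arbitrary rotation R = I, and all variable names are hypothetical.

```python
import numpy as np

# Illustrative correlated data; q is the assumed latent dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
n_samples, n_features = X.shape
q = 2

# Maximum-likelihood (1/n) covariance and its eigendecomposition.
mu = X.mean(axis=0)
Xc = X - mu
cov_mle = Xc.T @ Xc / n_samples
eigvals, eigvecs = np.linalg.eigh(cov_mle)
order = np.argsort(eigvals)[::-1]          # sort eigenvalues descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# PPCA ML estimates, taking the arbitrary rotation R to be the identity.
U_q = eigvecs[:, :q]
sigma2 = eigvals[q:].mean()                # mean of the discarded eigenvalues
W = U_q * np.sqrt(eigvals[:q] - sigma2)    # scales each column of U_q
M = W.T @ W + sigma2 * np.eye(q)

# Conventional PCA projection: coordinates along the leading eigenvectors.
Z_conventional = Xc @ U_q

# PPCA posterior-mean projection: M^{-1} W^T (t - mu) for every sample.
Z_ppca = np.linalg.solve(M, W.T @ Xc.T).T

# With R = I the two differ by a per-component shrinkage factor.
shrink = np.sqrt(eigvals[:q] - sigma2) / eigvals[:q]
```

Under these assumptions the PPCA latent coordinates are the conventional ones rescaled per component, which is why the two projections cannot be used interchangeably without a note in the documentation.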
