Description
The documentation for sklearn.decomposition.PCA states:
"Implements the probabilistic PCA model from: M. Tipping and C. Bishop, Probabilistic Principal Component Analysis, Journal of the Royal Statistical Society, Series B, 61, Part 3, pp. 611-622 via the score and score_samples methods."
The intent for the other methods is unclear, but to match the maximum-likelihood estimate (MLE) for PCA, some changes are needed:
explained_variance_ = (S ** 2) / (n_samples - 1)
on Line 423 needs to be
explained_variance_ = (S ** 2) / n_samples
The former uses the unbiased estimate of the covariance, whereas the latter is the maximum-likelihood estimate used in Tipping and Bishop, eq. (11).
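To make the difference concrete, here is a minimal NumPy sketch (not the scikit-learn source). It assumes `S` holds the singular values of the centered data matrix, as in the line quoted above; the random data and variable names are only illustrative.

```python
import numpy as np

# Illustrative data only: 50 samples, 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
n_samples = X.shape[0]

# Center the data and take its singular values.
Xc = X - X.mean(axis=0)
_, S, _ = np.linalg.svd(Xc, full_matrices=False)

# Normalization currently used: eigenvalues of the unbiased covariance.
explained_variance_unbiased = (S ** 2) / (n_samples - 1)

# Proposed normalization: eigenvalues of the maximum-likelihood covariance.
explained_variance_mle = (S ** 2) / n_samples

# Cross-check against the eigenvalues of the two covariance estimates.
eig_unbiased = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False)))[::-1]
eig_mle = np.sort(np.linalg.eigvalsh(np.cov(Xc, rowvar=False, bias=True)))[::-1]
```

The two versions differ only by the factor `(n_samples - 1) / n_samples`, so the discrepancy is noticeable mainly for small sample sizes.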
Also, the projection of PPCA in Sec. 3.3 of Tipping and Bishop differs from the conventional projection used in the implementation here, and the documentation should make that clear. It may be worth adding a separate class (PPCA) that implements Tipping and Bishop more cleanly.
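As a rough illustration of how the two projections can differ, the sketch below assumes the PPCA projection meant here is the posterior mean of the latent variables, `M^{-1} W^T (t - mu)` with `M = W^T W + sigma^2 I`, and takes the arbitrary rotation in the maximum-likelihood `W` to be the identity. The data, the choice `q = 2`, and all variable names are hypothetical; this is not scikit-learn code.

```python
import numpy as np

# Illustrative correlated data: 200 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
n_samples, d = X.shape
q = 2  # number of retained components (arbitrary choice)

mu = X.mean(axis=0)
Xc = X - mu
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)

lam = (S ** 2) / n_samples   # eigenvalues of the MLE covariance, descending
U_q = Vt[:q].T               # leading q principal axes, shape (d, q)

# Maximum-likelihood PPCA parameters (rotation assumed to be the identity).
sigma2 = lam[q:].mean()                  # mean of the discarded eigenvalues
W = U_q * np.sqrt(lam[:q] - sigma2)      # shape (d, q)
M = W.T @ W + sigma2 * np.eye(q)         # shape (q, q)

# Conventional PCA projection onto the principal axes.
Z_pca = Xc @ U_q

# PPCA posterior-mean projection: M^{-1} W^T (t - mu) for every sample.
Z_ppca = np.linalg.solve(M, W.T @ Xc.T).T
```

Under these assumptions the posterior mean is the conventional projection with each component rescaled by `sqrt(lam_i - sigma2) / lam_i`, so the two agree in direction but not in scale.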