
Implemented Supervised PCA #5196

Closed

Conversation

stylianos-kampakis

Implemented the Supervised PCA algorithm by Bair et al., plus an extension of the model for classification based on logistic regression.

References: Bair, Eric, et al. "Prediction by supervised principal
components." Journal of the American Statistical Association 101.473
(2006).

@amueller
Member

amueller commented Sep 2, 2015

Thanks for the PR.
I am not that familiar with the algorithm, and it would be great if you could add examples that compare this against the elastic net and linear discriminant analysis to show the benefit.

Also, this seems to be the same as make_pipeline(PCA(), LogisticRegression()), right?
If that is the case, I don't think adding an extra model is warranted; maybe an example instead?

It is not that easy to do for LogisticRegression, but it will be with #4242.

@stylianos-kampakis
Author

Hello,

It is similar but not equivalent. Supervised PCA also contains an extra step that filters out useless attributes. So the steps are:

  1. Filter attributes: fit a model using each single feature as input, and keep the feature only if its coefficient is above a threshold.
  2. Conduct PCA on the reduced dataset.
  3. Fit the final model.

The first step is something that can require a few lines of code, so having a new model makes life a bit easier :)
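
In scikit-learn terms, the three steps might look like the following rough sketch (not the PR's actual code; the base model, threshold, and n_components below are illustrative choices):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

def supervised_pca_fit(X, y, threshold=0.1, n_components=2):
    # Step 1: fit a univariate model on each feature and keep the
    # features whose largest absolute coefficient exceeds the threshold.
    scores = np.array([
        np.abs(LogisticRegression().fit(X[:, [j]], y).coef_).max()
        for j in range(X.shape[1])
    ])
    mask = scores > threshold

    # Step 2: conduct PCA on the reduced dataset.
    pca = PCA(n_components=n_components).fit(X[:, mask])

    # Step 3: fit the final model on the principal components.
    final = LogisticRegression().fit(pca.transform(X[:, mask]), y)
    return mask, pca, final
```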

In the book "The Elements of Statistical Learning" there are some examples of cases where this technique does better than the elastic net. I am not sure whether they contain examples that include LDA; I can look into it.

Best regards,
Stelios

@amueller
Member

amueller commented Sep 8, 2015

Sorry, I misread the code then.
So it is equivalent to make_pipeline(LogisticRegression(), PCA(), LogisticRegression())?
The first logistic regression will drop features whose coefficients are below a threshold, then you do a PCA and then train a model on the outcome. I'm not sure what the PCA is for, though. I'll have a look at ESL…

@amueller
Member

amueller commented Sep 8, 2015

Ah, ok, you do reduce the number of components, and they use univariate selection. So it is make_pipeline(SelectKBest(), PCA(), LogisticRegression()).
They do use a fancier univariate selection criterion in the first step, one more appropriate for survival analysis. That gives them an advantage on their data.
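
Spelled out (the k and n_components settings here are placeholders):

```python
from sklearn.pipeline import make_pipeline
from sklearn.feature_selection import SelectKBest
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

# Univariate selection, then PCA on the surviving features,
# then the final classifier.
pipe = make_pipeline(SelectKBest(k=20), PCA(n_components=2),
                     LogisticRegression())
```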

I don't think this particular pipeline deserves its own estimator.

@GaelVaroquaux
Member

GaelVaroquaux commented Sep 8, 2015 via email

…coef, concordance correlation coef, example of concordance vs Pearson

Added the following:

1) Improved version of supervised PCA
2) Example of supervised PCA against LDA and QDA
3) Example of supervised PCA against the elastic net
4) Pearson and concordance correlation coefficients
5) Example where the concordance correlation coefficient can be better than Pearson
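
For reference, a generic version of the concordance statistic mentioned above (a sketch of Lin's coefficient, not the PR's code):

```python
import numpy as np

def concordance_correlation(x, y):
    # Lin's concordance correlation coefficient:
    #   rho_c = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    # Unlike Pearson's r, it penalizes shifts in location and scale.
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    cov_xy = np.mean((x - x.mean()) * (y - y.mean()))
    return 2 * cov_xy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)
```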
@hlin117
Contributor

hlin117 commented Nov 25, 2015

There's already a somewhat similar example in scikit-learn's documentation; it's only missing the PCA step.
http://scikit-learn.org/stable/auto_examples/feature_selection/feature_selection_pipeline.html

@amueller
Member

amueller commented Sep 27, 2018

Closing as no reply and no excitement. This is a pretty straightforward pipeline, imho.

@amueller amueller closed this Sep 27, 2018