[MRG+1] LDA refactoring #3523

Merged
merged 56 commits into scikit-learn:master from cle1109:slda on Dec 19, 2014

Conversation

@cbrnr (Contributor) commented Aug 1, 2014

OK, here is a new version of the LDA class with (optional) shrinkage. The one in #3105 was already very messy, so I thought I'd start from scratch. By default, nothing has changed and the old SVD-based code is used.

@ogrisel (Member) commented Aug 1, 2014

This PR will need to be rebased on top of the current master, and the input validation needs to be updated to use the new utilities, see:

Also, as @larsmans said in the previous PR, the narrative doc in ./doc/modules/lda_qda.rst needs to be updated to highlight this new option.

Finally, it would be great to write an example in the examples folder that plots the impact of using shrinkage on prediction accuracy, and to use that plot in the narrative documentation.
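
A rough sketch of what such an example could look like (an assumption on my part: this uses the modern class name LinearDiscriminantAnalysis and the solver/shrinkage API the PR later converged on, plus a purely illustrative synthetic dataset):

```python
# Sketch: compare LDA test accuracy with and without Ledoit-Wolf shrinkage
# when the number of features is large relative to the number of samples.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.RandomState(0)
n_train, n_test, n_features = 20, 200, 100

def make_data(n_samples):
    # Two classes that differ only in the mean of the first feature.
    X = rng.randn(n_samples, n_features)
    y = rng.randint(2, size=n_samples)
    X[y == 1, 0] += 2.0
    return X, y

X_train, y_train = make_data(n_train)
X_test, y_test = make_data(n_test)

plain = LinearDiscriminantAnalysis(solver='lsqr', shrinkage=None)
shrunk = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto')

print("no shrinkage:", plain.fit(X_train, y_train).score(X_test, y_test))
print("Ledoit-Wolf :", shrunk.fit(X_train, y_train).score(X_test, y_test))
```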

Once all these changes are made, please change the title of this PR from [WIP] to [MRG] for final review.

@ogrisel ogrisel changed the title New LDA class with shrinkage [WIP] New LDA class with shrinkage Aug 1, 2014
@mbillingr (Contributor)

@cle1109 I've updated examples/plot_lda.py from the old PR to use the new LDA API: 2ef847f.
You can pull it from the slda branch in my fork.

@cbrnr cbrnr changed the title [WIP] New LDA class with shrinkage [MRG] New LDA class with shrinkage Aug 5, 2014
@coveralls (Coverage Status)

Changes Unknown when pulling 5a18f80 on cle1109:slda into scikit-learn:master.

@cbrnr (Contributor, Author) commented Aug 5, 2014

OK, I think this should be it, everything should be done.

@coveralls (Coverage Status)

Changes Unknown when pulling e610f87 on cle1109:slda into scikit-learn:master.

@coveralls (Coverage Status)

Changes Unknown when pulling 64dc88b on cle1109:slda into scikit-learn:master.

Shrinkage LDA can be used by setting the ``use_covariance`` parameter of the
:class:`lda.LDA` class to 'ledoitwolf'. This automatically determines the
optimal shrinkage parameter in an analytic way following the lemma introduced by
Ledoit and Wolf.
Member: add ref here

Member: Forget it, it's right below. My bad.
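
For context, the analytic shrinkage referenced in the doc snippet above is the Ledoit-Wolf estimator, which scikit-learn also exposes standalone; a minimal sketch of what it computes (the reconstruction formula below reflects my understanding of the estimator, not code from this PR):

```python
# Sketch: ledoit_wolf returns a shrunk covariance matrix and the analytically
# determined shrinkage intensity in [0, 1].
import numpy as np
from sklearn.covariance import ledoit_wolf

rng = np.random.RandomState(0)
X = rng.randn(30, 50)  # more features than samples: the empirical covariance is singular

shrunk_cov, shrinkage = ledoit_wolf(X)
emp_cov = np.cov(X, rowvar=False, bias=True)

# The shrunk estimate blends the empirical covariance with a scaled identity:
#   shrunk_cov = (1 - shrinkage) * emp_cov + shrinkage * mu * I,  mu = trace(emp_cov) / n_features
mu = np.trace(emp_cov) / X.shape[1]
reconstructed = (1 - shrinkage) * emp_cov + shrinkage * mu * np.eye(X.shape[1])
print("shrinkage intensity:", shrinkage)
print("matches formula:", np.allclose(shrunk_cov, reconstructed))
```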

@coveralls (Coverage Status)

Changes Unknown when pulling 807c83f on cle1109:slda into scikit-learn:master.

# centered group data
Xgc = Xg - meang
Xc.append(Xgc)
if self.use_covariance is None:
Member: I would extract the two solvers as private functions (or private methods if you need access to the object).

@cbrnr (Contributor, Author) commented Aug 12, 2014

OK, so here's my take on the discussions in #3500 and #3105 (might I suggest moving the discussion here, since this is where our latest code is).

I really like the idea of introducing solver and alpha parameters. The default value for solver could be SVD (the current implementation, with the bug in the transform method fixed). The other option could be covariance. The alpha parameter should work in both cases, but the value ledoit_wolf only makes sense in the latter.

I support the idea of introducing the possibility to optimize the parameter via cross-validation.

The other question is how close to the textbook we want to be in the covariance method. As @kazemakase pointed out, we don't need to solve the eigenvalue problem for classification. That's also how most algorithms are implemented, because inverting the between scatter matrix could be a problem (see also the discussion at Cross Validated).

So in principle we could go ahead and start implementing it according to all suggestions here. The only thing that needs to be clarified first is how we do both classification and transform in the covariance method.

@mblondel mblondel changed the title [MRG] New LDA class with shrinkage [WIP] LDA refactoring Aug 12, 2014
@mblondel (Member)

You need some sort of eigen/singular value decomposition for the transform, but you can fit the classifier with a matrix inverse or an equation solver.
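
A minimal numpy/scipy sketch of that idea (following the standard textbook form of the LDA decision function, not necessarily the exact code in this PR): prediction only needs the solution of a linear system with the shared covariance, while transform would additionally require an eigendecomposition.

```python
# Sketch: fit the LDA classifier by solving a linear system (the "lsqr" idea);
# no eigendecomposition is needed for prediction.
import numpy as np
from scipy import linalg

def fit_lda_lsqr(X, y, priors=None):
    classes = np.unique(y)
    if priors is None:
        priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Pooled within-class covariance.
    Xc = np.vstack([X[y == c] - means[i] for i, c in enumerate(classes)])
    cov = Xc.T @ Xc / (len(X) - len(classes))
    # Solve cov @ coef.T = means.T instead of inverting cov explicitly.
    coef = linalg.lstsq(cov, means.T)[0].T
    intercept = -0.5 * np.einsum('ij,ij->i', means, coef) + np.log(priors)
    return classes, coef, intercept

def predict_lda(X, classes, coef, intercept):
    return classes[np.argmax(X @ coef.T + intercept, axis=1)]
```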

Instead of introducing a fit_transform option as I previously suggested, I think we could introduce two solvers covariance-lsqr and covariance-eigen. The transform method would only be available if covariance-eigen was used.

We could also add a covariance-cg solver which uses conjugate gradient instead of lsqr for solving the least squares problem / system of equations. This should make it possible to solve the problem without materializing the covariance matrix in memory (see the _solve_sparse_cg function in the Ridge module). This can of course be addressed in another PR.
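
A sketch of that matrix-free idea (the function below is illustrative; only _solve_sparse_cg exists in the Ridge module): conjugate gradient only needs covariance-vector products, which can be evaluated from the centered data matrix directly.

```python
# Sketch: solve cov @ w = b with conjugate gradient, computing the product
# cov @ v as Xc.T @ (Xc @ v) / (n_samples - 1) without forming cov itself.
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def solve_without_covariance(Xc, b):
    n_samples, n_features = Xc.shape

    def matvec(v):
        # In practice a shrinkage/regularization term would be added here to
        # keep the system well conditioned when n_samples < n_features.
        return Xc.T @ (Xc @ v) / (n_samples - 1)

    op = LinearOperator((n_features, n_features), matvec=matvec)
    w, info = cg(op, b)
    return w
```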

(I took the liberty to change the title of this PR and add a TODO list.)

@mblondel (Member)

Actually, we can drop the covariance- prefix. This way, the names will be consistent with the Ridge module.

@mblondel (Member)

Just svd. The fact that lsqr and eigen use the covariance matrix can be documented in the docstring. In the Ridge module, we give a short description of each solver in the docstring:
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/linear_model/ridge.py#L215.

@cbrnr (Contributor, Author) commented Aug 12, 2014

OK! (I deleted my comment because I saw it in the todo list at the top)

@mbillingr (Contributor)

I have a comment on one of the items in the TODO list:

test that solvers return the same results (up to numerical errors)

Numerical errors can blow up and cause big differences in classification results if the feature space is high-dimensional. That is because these errors cause a small amount of uncertainty in the orientation of the hyperplane. The volume of this uncertainty increases exponentially with dimensionality. The bigger this volume, the higher the chance of unseen samples lying on opposite sides of the hyperplane in different implementations.

In short, classification results can differ noticeably due to numerical errors. Something to keep in mind when designing the tests.
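
One way to write such a test (assuming the modern class name LinearDiscriminantAnalysis and the svd/lsqr/eigen solver names discussed above; it compares predictions rather than raw coefficients, to stay robust to the hyperplane-orientation issue described here):

```python
# Sketch: check that the solvers agree up to a small tolerance on
# low-dimensional, well-separated data where orientation uncertainty is small.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def test_solvers_agree():
    X, y = make_blobs(n_samples=200, centers=3, n_features=5, random_state=0)
    preds = {}
    for solver in ('svd', 'lsqr', 'eigen'):
        preds[solver] = LinearDiscriminantAnalysis(solver=solver).fit(X, y).predict(X)
    # Allow a small fraction of disagreements near the decision boundary
    # instead of requiring exact equality.
    for solver in ('lsqr', 'eigen'):
        assert np.mean(preds[solver] != preds['svd']) < 0.01
```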


Parameters
----------
n_components : int
    Number of components (< n_classes - 1) for dimensionality reduction

priors : array, optional, shape = [n_classes]
    Priors on classes

Member: Docstring for intercept_ is missing. This is an array of size n_classes.

@cbrnr (Contributor, Author) commented Dec 16, 2014

@agramfort, I am very confident now that the code does what it is supposed to do :-). The minor issue of differently scaled scalings_ should be addressed in a follow-up PR (it doesn't affect the functionality, but for consistency it would be nice if both solvers produced the same coefficients). What do you think (in particular @ogrisel)?

@amueller (Member)

I think we should merge this. Maybe squash the commits; there are quite a few.

@cbrnr (Contributor, Author) commented Dec 17, 2014

Do you want me to squash the commits (into only one?), or will you do that?

@agramfort (Member)

@amueller?

I don't mind pushing these commits as they are... let's merge this ASAP.

@GaelVaroquaux (Member)

I don't mind pushing these commits as they are... let's merge this ASAP.

+1

@amueller (Member)

OK, it looks like we don't do squashing any more... I felt that was helpful for bugfix releases, but OK.

amueller added a commit that referenced this pull request Dec 19, 2014
[MRG+1] LDA refactoring
@amueller amueller merged commit 1aa5a9f into scikit-learn:master Dec 19, 2014
@GaelVaroquaux (Member)

I felt that was helpful for bugfix releases but ok....

I agree with you. I think that we should encourage it, but I don't feel that it is mandatory.

@agramfort (Member)

🍻 @cle1109 !

@cbrnr (Contributor, Author) commented Dec 20, 2014

😄 - awesome! Thank you guys, I really learned a lot while working on this PR. Looking forward to contributing more soon!

@amueller (Member)

This fixes #1649, right? Or only for the new solvers?

@agramfort (Member)

I think so

@amueller (Member)

@agramfort, is that in reply to "fixes #1649" or to "only for the new solvers"?

@amueller amueller mentioned this pull request Oct 9, 2015