[WIP] Added PredictionTransformer and ThresholdClassifier #6663

Closed

Member

@betatim betatim commented Apr 14, 2016

Together, these two classes can be used in a pipeline
to change the classification threshold from the default
of 0.5 to any value.
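
To make the idea concrete, here is a minimal sketch of a threshold wrapper. It is not the code from this PR: `ThresholdedBinaryClassifier` and `ToyScoreModel` are hypothetical names, and the wrapper only assumes that the wrapped model exposes `fit`, `classes_` and `predict_proba`, with the second column of `predict_proba` holding the probability of `classes_[1]` (the scikit-learn convention).

```python
class ThresholdedBinaryClassifier:
    """Hypothetical wrapper: predict classes_[1] when P(classes_[1]) >= threshold."""

    def __init__(self, base_estimator, threshold=0.5):
        self.base_estimator = base_estimator
        self.threshold = threshold

    def fit(self, X, y):
        self.base_estimator.fit(X, y)
        self.classes_ = list(self.base_estimator.classes_)
        return self

    def predict(self, X):
        negative, positive = self.classes_
        # Column 1 is assumed to hold the probability of the positive class.
        proba = self.base_estimator.predict_proba(X)
        return [positive if row[1] >= self.threshold else negative for row in proba]


class ToyScoreModel:
    """Stand-in for a real probabilistic classifier (assumption for this demo):
    it simply treats the single input feature as P(class 1)."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        return self

    def predict_proba(self, X):
        return [[1.0 - x[0], x[0]] for x in X]


X = [[0.1], [0.4], [0.6], [0.9]]
y = [0, 0, 1, 1]

default_labels = ThresholdedBinaryClassifier(ToyScoreModel()).fit(X, y).predict(X)
strict_labels = ThresholdedBinaryClassifier(ToyScoreModel(), threshold=0.8).fit(X, y).predict(X)

print(default_labels)  # [0, 0, 1, 1]
print(strict_labels)   # [0, 0, 0, 1]
```

Raising the threshold from 0.5 to 0.8 flips the sample with score 0.6 from the positive to the negative class, which is the whole effect being discussed here.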

This is a start on #4813

Is pipeline.py the right home for this?

  • documentation
  • example
  • tests
  • checks and robustness in the estimators

edit: from comment below:

Err also is PredictionTransformer the same as VotingClassifier by any chance? ;)

Yes it is, so PredictionTransformer should be dropped from this PR.


/cc @joshlk

@amueller
Member

Is there a case where you want one and not the other?
I think using them together is a common use case, and so I would try to make that easy.
Maybe make these two classes private and add a third that does the whole thing?
Or just add the third class?
Removing public interfaces is hard, adding them is easy.

@betatim
Member Author

betatim commented Apr 15, 2016

Stacking is the one use case I had in mind: train several classifiers, take each of their outputs, and feed them into another model to "combine" them. I know this from HEP work (e.g. http://arxiv.org/abs/0903.0850, p. 6 of the PDF; I'm sure there are more, this was just the first paper I could find). It might also be something people do in Kaggle competitions.
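
As a rough illustration of the first step of stacking, the sketch below turns the outputs of several already-fitted classifiers into a feature matrix for a second-level model. `stack_predictions` and `ScoreModel` are hypothetical names used only for this example, and the base models are assumed to expose a scikit-learn style `predict_proba`.

```python
def stack_predictions(fitted_models, X):
    """Return one row per sample and one column per model: P(positive class)."""
    columns = [[row[1] for row in model.predict_proba(X)] for model in fitted_models]
    # Transpose from per-model columns to per-sample rows.
    return [list(sample) for sample in zip(*columns)]


class ScoreModel:
    """Stand-in for a fitted base classifier (assumption for this demo):
    `score` maps a sample to P(class 1)."""

    def __init__(self, score):
        self.score = score

    def predict_proba(self, X):
        return [[1.0 - self.score(x), self.score(x)] for x in X]


base_models = [
    ScoreModel(lambda x: x[0]),               # looks at the first feature only
    ScoreModel(lambda x: (x[0] + x[1]) / 2),  # averages both features
]
X = [[0.25, 0.75], [0.5, 0.125]]

meta_features = stack_predictions(base_models, X)
print(meta_features)  # [[0.25, 0.5], [0.5, 0.3125]]
```

The second-level model would then be trained on `meta_features`. In practice the base-model predictions used for that training are usually produced out-of-fold, so the combiner does not learn from predictions the base models made on their own training data.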

I'm not sure it is "worth it" given your point that public interfaces are hard to remove. Having two private classes plus a third, public one that does it all in one step sounds like a good move.

@betatim
Member Author

betatim commented Apr 18, 2016

Is there a way to mark my classifier as "two class only" for the common checks? Alternatively, does someone have a good idea of what ThresholdClassifier should do in a situation with three (or more) classes?

Related to the stacking use case: #6674

@betatim
Member Author

betatim commented May 1, 2016

@amueller any idea how to mark ThresholdClassifier as "two class only", or how to deal with the several common tests that feed three-class problems to the classifier?

A good idea for how to support multi-class in ThresholdClassifier would be welcome; so far I could not think of anything.
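
Until there is a sensible multi-class behaviour, one fallback is to reject non-binary targets explicitly when fitting. The helper below is a hypothetical sketch of such a guard (`check_binary_targets` is not an existing scikit-learn function and is not part of this PR).

```python
def check_binary_targets(y):
    """Return the two sorted class labels, or raise if y is not a two-class target."""
    classes = sorted(set(y))
    if len(classes) != 2:
        raise ValueError(
            "threshold-based prediction needs exactly 2 classes, got %d" % len(classes)
        )
    return classes


print(check_binary_targets([0, 1, 1, 0]))  # [0, 1]

try:
    check_binary_targets([0, 1, 2])
except ValueError as exc:
    print("rejected:", exc)
```

This only makes the limitation explicit at fit time; it does not change how the common tests treat the estimator.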

@amueller
Member

Err also is PredictionTransformer the same as VotingClassifier by any chance? ;)

@amueller
Member

We need this for two-class classifiers: #6599. For now, add it manually to one of the skip lists. Maybe just skip the multi-class tests?

@betatim
Member Author

betatim commented Oct 11, 2016

Err also is PredictionTransformer the same as VotingClassifier by any chance? ;)

Aha! Yes, because I didn't read the docs for VotingClassifier to realise it has a transform(). D'oh.

@amueller
Member

well, I knew it but still needed to reread the previous discussion to see that that was the case lol.

Base automatically changed from master to main January 22, 2021 10:49
@betatim betatim closed this by deleting the head repository Sep 2, 2022