
[WIP] Add new feature StackingClassifier #7427


Closed
wants to merge 11 commits

Conversation

yl565
Contributor

@yl565 yl565 commented Sep 14, 2016

PR to #4816. This is a continuation of #6674.

To-do:



@yl565 yl565 changed the title [WIP] Add new feature StackingClassifier [MRG] Add new feature StackingClassifier Sep 23, 2016
@yl565
Contributor Author

yl565 commented Sep 23, 2016

Stacking classifier implemented. All suggestions are welcome. I'm also considering adding the ability to set estimators in set_params, similar to #7288
@MechCoder You mentioned there are questions regarding API?

Member

@amueller amueller left a comment

This looks nice. Can you add an illustrative example?
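For context, a minimal sketch of the kind of illustrative example being requested, assuming the estimators/meta_estimator constructor arguments used in this PR's tests (the import path is an assumption, since the class only exists on this branch):

# Sketch only: StackingClassifier is the class added by this PR; the import
# path below is assumed, not final.
from sklearn.ensemble import StackingClassifier  # assumption: provided by this PR
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier(random_state=0)),
                ('nb', GaussianNB())],
    meta_estimator=LogisticRegression())
stack.fit(X_train, y_train)
print(stack.score(X_test, y_test))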



class StackingClassifier(BaseEstimator, ClassifierMixin):
""" Stacking classifier for combining unfitted estimators
Member

I know what you mean by unfitted but I feel it is a bit awkward here and in the next sentence.


For integer/None inputs, if the estimator is a classifier and ``y`` is
either binary or multiclass, :class:`StratifiedKFold` is used. In all
other cases, :class:`KFold` is used.
Member

In all other cases? If y is a different format, I'd say

-------
self : object
"""
if isinstance(y, np.ndarray) and len(y.shape) > 1 and y.shape[1] > 1:
Member

maybe use _type_of_target? not sure

Contributor Author

Could you please explain what you mean by _type_of_target?

Member

sklearn.utils.multiclass.type_of_target
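For reference, a quick illustration of what type_of_target returns for different y formats:

from sklearn.utils.multiclass import type_of_target

print(type_of_target([0, 1, 1, 0]))              # 'binary'
print(type_of_target([0, 1, 2]))                 # 'multiclass'
print(type_of_target([[1, 0, 1], [0, 1, 0]]))    # 'multilabel-indicator'
print(type_of_target([[0.5, 2.1], [1.3, 0.2]]))  # 'continuous-multioutput'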

Contributor Author

thanks

raise NotImplementedError('Multilabel and multi-output'
' classification is not supported.')

if self.estimators is None or len(self.estimators) == 0:
Member

How about `if not self.estimators`? Also, please add a `got {}` to the error message.
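For illustration, a standalone sketch of the suggested check (_check_estimators is a hypothetical helper name, not code from this PR):

def _check_estimators(estimators):
    # `if not estimators` covers both None and an empty list, and the error
    # message reports what was actually received, as suggested above.
    if not estimators:
        raise AttributeError(
            'Invalid `estimators` attribute, `estimators` should be a '
            'non-empty list of (name, estimator) tuples; got {!r}'
            .format(estimators))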


if not is_classifier(self.meta_estimator):
raise AttributeError('Invalid `meta_estimator` attribute, '
'`meta_estimator` should be a classifier')
Member

maybe also print the class name?

The meta-estimator to combine the predictions of each individual
estimator

method : string, optional, default='predict_proba'
Member

I think "default" should be "auto" which is predict_proba if it exists and decision_function otherwise.

raise ValueError('Underlying estimator `{0}` does not '
'support `{1}`.'.format(name, param))

self.le_ = LabelEncoder()
Member

self.le_ = LabelEncoder().fit(y) ?

self.meta_estimator_.fit(scores, transformed_y, **kwargs)
return self

def _form_meta_inputs(self, predicted):
Member

maybe this doesn't need to be a separate method? maybe just a loop in _est_predict or something?

@yl565
Contributor Author

yl565 commented Oct 10, 2016

@amueller I have updated the code to reflect your suggestions, though I figured _form_meta_inputs is still needed for transforming the cross-validated scores in this line

@yl565
Contributor Author

yl565 commented Oct 17, 2016

@amueller, do you mean show an example of its usage including the graphs in this issue's conversation?

@amueller
Member

I mean an example to show how to use it and what it does. I haven't really put any thought into it ;)

@yl565
Contributor Author

yl565 commented Nov 17, 2016

@jnothman Should I update this PR with _BaseComposition after #7674 is merged? Or should I open a new PR after this PR is merged?

@jnothman
Member

Sure, you can update it with _BaseComposition after #7674 is merged.


@yl565 yl565 changed the title [MRG] Add new feature StackingClassifier [WIP] Add new feature StackingClassifier Nov 17, 2016
@yl565 yl565 mentioned this pull request Nov 20, 2016
@ivallesp
Contributor

Hi,
As I wrote in #6674, I would like to collaborate on this given my experience on Kaggle.

I have been reviewing the work done by @yl565 and it is impressive. I really like how it is organized and I think it is very well done. However, I would like to add one piece of functionality which may be key to the implementation: some problems, especially Kaggle ones, require training thousands of classifiers; that's why I think the current implementation is a bit monolithic. It would be nice to be able to generate the train and test meta-predictors separately, so that they can be stored to disk and retrieved later to either create a next layer or combine them using one meta-model.

So, what I mean is that, for example, if the meta_estimator parameter of the StackingClassifier class is None, the fit method would return an object with an attribute containing a matrix of all the training meta-predictors, and the predict method would return a matrix with all the predictions of the models trained on the whole training set. In both cases the columns would be in the same order, ensuring a correct correspondence between the datasets.

What do you think about it? Does it make sense to you? If so, I can help develop that functionality.

Best,
Iván

@yl565
Contributor Author

yl565 commented Nov 20, 2016

@ivallesp, I'm not sure I completely understand what you have in mind, but it seems to me the predict method you are thinking about is transform in sklearn convention? I think we could add something like add_estimator(self, estimators, is_trained) and delete_estimator(self, estimator_names) to allow re-using trained sub-estimators. I'm not sure how you could save a trained estimator to disk though...

@jnothman
Member

Hi Iván,

Are you mostly talking about collapsing, for the sake of prediction, a set of predictors, when they only involve matrix multiplication?


@ivallesp
Contributor

ivallesp commented Nov 21, 2016

Sorry, it seems I was not very clear.

Stacked generalization is based on generating, from a list of models, training meta-predictors: out-of-fold predictions are produced with k-fold cross-validation and stacked into vectors (meta-features; one per model) with the same length as the training set. That is what the cross_val_predict method does. The second step consists of training the models on the whole training set to generate the test meta-predictors. On top of this, a meta-model is trained to intelligently combine the meta-predictors into a more powerful prediction which, in theory, will at worst match the best of your individual predictions.

What I mean is that it would be really interesting to be able, once several models have been trained and the new training and test sets composed of meta-predictors have been generated, to retrieve these matrices (or datasets) so they can be treated in a different way; that is, to be able to stop just before applying the meta-model. That way, the user could store these meta-predictors for the training and the test set in order to, for example, build a new stacked generalization (a new layer) on top of this. Another example would be appending the meta-predictors to the original training and test sets and building a model that combines the meta-predictors and the original features.

In addition I would remark that sometimes it may be useful to predict a transformed target variable; for example, in the case of a skewed target variable in a regression problem, you can build a stacker using 20 models with the original target variable, 20 more with the log of the target variable, and 20 more using the Box-Cox of it. The user could then combine these 60 meta-predictors and build a meta-model on top of them. For that, we would need access to the training-set meta-predictors and the test-set meta-predictors.

Am I being clearer now? If not, or not completely, please do not hesitate to let me know and I will try to add more examples.
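For illustration, a minimal sketch of the two steps described above using cross_val_predict; the dataset and models are arbitrary:

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
base_models = [RandomForestClassifier(random_state=0), LogisticRegression()]

# Step 1: out-of-fold predictions on the training set (one meta-feature per model).
train_meta = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method='predict_proba')[:, 1]
    for m in base_models])

# Step 2: refit each model on the whole training set to build the test meta-features.
test_meta = np.column_stack([
    m.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for m in base_models])

# The meta-model combines the meta-predictors.
meta = LogisticRegression().fit(train_meta, y_train)
print(meta.score(test_meta, y_test))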

@jnothman
Member

Right. I'm not able to give this enough attention to familiarise myself better with the techniques you are suggesting, but I'm interested in identifying an API that provides maximum flexibility while keeping it simple. One option, as Yichuan suggests, is to have a way to do the stacked classifier learning process, but then provide a transform that bypasses the metaestimator so that you can put it in any pipeline context. I think collapsing multiple estimators into one fast prediction, if you were ever suggesting that, might be something for version 2.


@yl565
Contributor Author

yl565 commented Nov 21, 2016

I will add the transform method; it could be useful for someone.

@ivallesp
Contributor

thank you!

@yl565
Contributor Author

yl565 commented Nov 21, 2016

I'm thinking about something like transform(self, X, is_apply_meta=True) so that when is_apply_meta=True, the transform of the meta-estimator will be called if it exists. Otherwise (is_apply_meta=False), the output will be a matrix whose columns are the outputs of the sub-estimators. @jnothman, what's your opinion?

@jnothman
Member

I think it's fine to assume applying meta is false. Just describe the transformation correctly. After all, that entire meta functionality can be produced with a Pipeline.


@jnothman
Member

jnothman commented Jan 9, 2017

Add to your todo list: narrative documentation (in doc/) and an example in examples/ comparing voting classifier with a couple of stacking meta-classifiers (although the real boon here is that it can be used for regression too)

@jnothman
Member

jnothman commented Jan 9, 2017

We probably want a StackingClassifier and a StackingRegressor (though I think we could in this case build them into one class...)

Member

@jnothman jnothman left a comment

You've not really tested the cv != 1 case.

There should also be a test that the estimators are cloned (i.e. the original inputs are unaffected).

Otherwise, this is looking pretty good!

"""
if any(s in type_of_target(y) for s in ['multilabel', 'multioutput']):
raise NotImplementedError('Multilabel and multi-output'
' classification is not supported.')
Member

why not?

self.le_ = LabelEncoder().fit(y)
self.classes_ = self.le_.classes_

transformed_y = self.le_.transform(y)
Member

I'm not certain why we need to do this.
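For reference, a quick illustration of what the LabelEncoder round-trip provides here:

from sklearn.preprocessing import LabelEncoder

le = LabelEncoder().fit(['spam', 'ham', 'spam'])
print(le.classes_)                    # ['ham' 'spam']
print(le.transform(['spam', 'ham']))  # [1 0]
print(le.inverse_transform([1, 0]))   # ['spam' 'ham']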

self.classes_ = self.le_.classes_

transformed_y = self.le_.transform(y)
if self.cv == 1:  # Do not cross-validate
Member

this won't work if cv is an array, though that is unlikely.

delayed(_parallel_fit)(clone(clf),
X, transformed_y, kwargs)
for _, clf in self.estimators)
scores = self._est_predict(X)
Member

I don't feel scores is the best name. Perhaps y_pred or y_score or predictions or Xt

self.meta_estimator_.fit(scores, transformed_y, **kwargs)
return self

def _form_meta_inputs(self, clf, predicted):
Member

call this clean_scores or clean_predictions or whatever?



def test_sample_weight():
"""Tests sample_weight parameter of StackingClassifier"""
Member

with nosetests, docstrings in test functions make test transcripts harder to read. make this a comment instead.



def test_classify_iris():
"""Check classification by majority label on dataset iris."""
Member

with nosetests, docstrings in test functions make test transcripts harder to read. make this a comment instead.



def test_predict_on_toy_problem():
"""Manually check predicted class labels for toy dataset."""
Member

with nosetests, docstrings in test functions make test transcripts harder to read. make this a comment instead.


y = np.array([1, 1, 1, 2, 2, 2])

assert_equal(all(clf1.fit(X, y).predict(X)), all([1, 1, 1, 2, 2, 2]))
Member

do you mean assert_array_equal? all will return a boolean, so you're asserting the equality of booleans here.

eclf1 = StackingClassifier(
estimators=[('lr', clf1), ('rf', clf2), ('nb', clf3)],
meta_estimator=clfm
).fit(X, y, sample_weight=np.ones((len(y),)))
Member

In the cv=1 case, at least, it should be possible to test sample_weight as corresponding to a repetition of elements.

@caioaao
Contributor

caioaao commented Mar 4, 2017

maybe I'm late here, but here's my two cents:
Stacking is already "hard": what it must do is help avoid data leakage during training. This implementation looks like it's doing a lot of stuff (I just skimmed through the code, but I saw it's even label-encoding things; I'm not sure it's a good idea to do this much here).
An idea: the first layer in the stacking can be seen as a transformer: it receives a feature set and outputs a new feature set. Each classifier being stacked is independent from the others in the same layer, so it looks like a perfect situation for the pipeline API. The real trick to turn a pipeline into a stacking classifier is the blending, and that's what's missing from sklearn. There's no need to implement a class to do what the pipeline API already does, just a class to blend a classifier and make it suitable for use as a transformer.
I've implemented those ideas here: https://gist.github.com/caioaao/28bf77e9a95ae6b70b14141feacb1f84
It doesn't have tests and it's probably lacking some asserts to make it more robust, so it's not useful for a PR in sklearn, but it may be useful for comparison purposes.
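A rough, self-contained sketch of that idea (not the gist's actual code; the wrapper name is made up): each base classifier is wrapped as a transformer whose fit_transform returns out-of-fold probabilities, the wrappers are combined with a FeatureUnion, and a Pipeline puts the meta-classifier on top.

from sklearn.base import BaseEstimator, TransformerMixin, clone
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import FeatureUnion, make_pipeline

class BlendedClassifierTransformer(BaseEstimator, TransformerMixin):
    # Hypothetical wrapper: exposes a classifier's predicted probabilities as
    # features so it can live inside a FeatureUnion / Pipeline.
    def __init__(self, estimator, cv=5):
        self.estimator = estimator
        self.cv = cv

    def fit(self, X, y):
        self.estimator_ = clone(self.estimator).fit(X, y)
        return self

    def fit_transform(self, X, y, **fit_params):
        # Out-of-fold predictions on the training set avoid label leakage.
        oof = cross_val_predict(clone(self.estimator), X, y,
                                cv=self.cv, method='predict_proba')
        self.fit(X, y)
        return oof

    def transform(self, X):
        # At prediction time, use the estimator refitted on all training data.
        return self.estimator_.predict_proba(X)

X, y = load_iris(return_X_y=True)
layer = FeatureUnion([
    ('rf', BlendedClassifierTransformer(RandomForestClassifier(random_state=0))),
    ('lr', BlendedClassifierTransformer(LogisticRegression()))])
stack = make_pipeline(layer, LogisticRegression())
print(stack.fit(X, y).predict(X[:5]))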

@jnothman
Member

jnothman commented Mar 4, 2017 via email

@caioaao
Contributor

caioaao commented Mar 5, 2017

@jnothman I didn't know about cross_val_predict; that actually makes things simpler. I updated the gist to use it. There are, of course, some improvements that could be made (like being able to pass other parameters to cross_val_predict): https://gist.github.com/caioaao/28bf77e9a95ae6b70b14141feacb1f84

About requiring the user to construct a FeatureUnion: I like a functional approach better and I think the code is clearer when you compose things with functions instead of creating new classes, but nothing stops you from writing a class that just uses FeatureUnion under the hood (instead of using make_stack_layer), rather than doing an ad-hoc implementation that in the end provides basically the same functionality. I'm actually against that choice and think make_stacking_classifier(stacked_estimators, meta_classifier) would be cleaner, but that's a design preference that isn't really aligned with the rest of sklearn's API (an example is LassoCV, RidgeCV, etc.).

@jnothman
Member

jnothman commented Mar 5, 2017 via email

@caioaao
Contributor

caioaao commented Mar 6, 2017

LassoCV was just an example of a class that simply wraps a composition of two classes.
About convenience over functional purity: maybe what I said was misinterpreted. What I meant by function composition is that make_stack_classifier would call make_feature_union and make_pipeline and return the result. I don't see how clf = make_stack_classifier([RandomForest(), LinearSVC()], LogisticRegression()) would be less convenient than clf = StackClassifier([RandomForest(), LinearSVC()], LogisticRegression()), but then again, this is just a design preference and the former isn't as well aligned with the rest of sklearn's API :)
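A sketch of that composition (make_stack_classifier is the hypothetical name from this discussion; it assumes the base estimators are already wrapped as blending transformers, e.g. like the wrapper sketched earlier in the thread):

from sklearn.pipeline import make_pipeline, make_union

def make_stack_classifier(stacked_estimators, meta_classifier):
    # Purely compositional: the first layer is a union of the blended base
    # estimators, and the meta-classifier is the final step of a pipeline.
    return make_pipeline(make_union(*stacked_estimators), meta_classifier)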

@yl565
Contributor Author

yl565 commented Mar 8, 2017

@jnothman Since #7674 does not seem likely to be merged anytime soon, do you think it's better to remove "update with _BaseComposition after #7674 is merged" from the to-do list so we can proceed with this PR?

@jnothman
Member

jnothman commented Mar 8, 2017 via email

@GaelVaroquaux
Member

What's the status of this PR? I'll have some free time next week and was thinking of reviewing it.

@caioaao
Contributor

caioaao commented May 30, 2017

As this looks stale, I'd really like to have a shot at implementing it as I said before. If I can do it before the weekend, would you guys mind taking it into consideration before merging/choosing this one?

@yl565
Contributor Author

yl565 commented May 30, 2017

I have been busy with my thesis. I'll try working on it this week.
I still need to incorporate #7674 into this.
There is also the issue of whether or not to support multilabel and multi-output classification, which would make this PR more complicated.

@jnothman
Member

jnothman commented Jun 1, 2017 via email

@caioaao
Contributor

caioaao commented Jun 1, 2017

I implemented my comments in #8960. As I said there, it's ready to handle several types of estimators (not just classifiers) and the implementation is simpler.

@jnothman
Member

jnothman commented Jun 1, 2017 via email

@AlJohri

AlJohri commented Oct 3, 2017

hi @jnothman, I'm interested in helping out with this PR. I've manually implemented a stacked classifier for my personal project several times at this point but haven't come up with a solution that lets me do GridSearchCV with the stacked classifier.

would anyone mind summarizing what's left to get this PR merged? is the to-do list at the top of the PR still accurate?

EDIT: my mistake, I didn't see that #8960 was more up to date.

Labels
Superseded (PR has been replaced by a newer PR), Waiting for Reviewer