
AverageRegressor? #10743

Closed
amueller opened this issue Mar 2, 2018 · 11 comments · Fixed by #12513

Comments

@amueller
Member

amueller commented Mar 2, 2018

Should we add the regressor equivalent of VotingClassifier, which would just compute averages? (I vaguely remember seeing that somewhere, but now I can't find the issue or PR.)

@mohamed-ali
Contributor

@amueller, I'd like to work on this issue, if no other PR is available.

@mohamed-ali
Contributor

mohamed-ali commented Mar 2, 2018

@amueller I think the closest existing estimator to this new AverageRegressor is BaggingRegressor (http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.BaggingRegressor.html).

BaggingRegressor fits its base regressors on random subsets of the data, all using the same base_estimator:

A Bagging regressor is an ensemble meta-estimator that fits base regressors each on random subsets of the original dataset and then aggregate their individual predictions (either by voting or by averaging) to form a final prediction.

The new AverageRegressor would be a bit different, in that it takes a list of estimators and fits each of them on the whole training set (instead of a random subset). Other than that, I guess both are similar in principle.
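A minimal sketch of the behavior described above, done by hand since no AverageRegressor class exists in scikit-learn at this point (estimator choices here are just for illustration):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=100, n_features=4, random_state=0)

# Unlike BaggingRegressor, every estimator is fit on the *whole* training set,
# and the estimators can be heterogeneous (like VotingClassifier's).
estimators = [LinearRegression(), Ridge(alpha=1.0),
              DecisionTreeRegressor(max_depth=3, random_state=0)]
for est in estimators:
    est.fit(X, y)

# The final prediction is the plain average of the individual predictions.
y_pred = np.mean([est.predict(X) for est in estimators], axis=0)
print(y_pred.shape)  # (100,)
```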

@jnothman
Member

jnothman commented Mar 2, 2018 via email

@mohamed-ali
Contributor

mohamed-ali commented Mar 2, 2018

@jnothman, I see averaging frequently in Kaggle competitions as one of the ensembling techniques; however, stacking has proven to be more valuable than simple averaging. I think we can consider averaging a special case of stacking where, instead of using a new estimator to aggregate the predictions of the previous estimators, we use the simple function sum(all_y_hat)/n_estimators.

I guess the argument for adding AverageRegressor as a separate regressor is to keep the API consistent with VotingClassifier and, also, to cover the most popular ensembling techniques in current use.
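The averaging-as-a-special-case-of-stacking view can be sketched like this (a rough illustration, not scikit-learn API; the names level_one and stacker are made up here):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge

X, y = make_regression(n_samples=80, n_features=3, random_state=1)
base = [LinearRegression().fit(X, y), Ridge(alpha=0.5).fit(X, y)]

# Stacking feeds the base predictions (one column each) to a
# second-level estimator.
level_one = np.column_stack([est.predict(X) for est in base])

# Averaging is the degenerate stacker: fixed equal weights, no fitting.
avg_pred = level_one.mean(axis=1)  # sum(all_y_hat) / n_estimators

# A learned stacker would instead fit an estimator on those columns.
stacker = LinearRegression().fit(level_one, y)
stacked_pred = stacker.predict(level_one)
```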

@mohamed-ali
Contributor

As far as I know, there are four categories of ensembling techniques. For each of them, sklearn implements the following:

  • Voting/Averaging: VotingClassifier for classification, but nothing yet for regression (averaging).
  • Bagging: BaggingRegressor, BaggingClassifier
  • Stacking (or blending): the aforementioned pull request [MRG+1] Stacking classifier with pipelines API #8960 will introduce it.
  • Boosting: GradientBoostingClassifier, GradientBoostingRegressor

So, I think that it's relevant to implement the AverageRegressor for the sake of completeness.

@agramfort
Member

agramfort commented Mar 4, 2018 via email

@mohamed-ali
Contributor

mohamed-ali commented Mar 5, 2018

Can I start working on a PR for this, or should I wait until consensus is reached?

@jnothman
Member

jnothman commented Mar 5, 2018 via email

@mohamed-ali
Contributor

mohamed-ali commented Mar 25, 2018

@amueller @jnothman @agramfort, I understand that the decision hasn't been made yet, but I thought it might be useful to have a concrete PR in case we decide in favor of including ensemble.AverageRegressor.

The work has been pushed here: #10868.

I look forward to your reviews.

@stsouko

stsouko commented Nov 3, 2018

I forked #10868 and refactored @mohamed-ali's code.
Still to do: the user guide; it will follow soon.

@stsouko

stsouko commented Nov 19, 2018

Code is ready to merge.
@amueller, can you review PR #12513?
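For context, #12513 appears to be what ultimately shipped in scikit-learn (0.21+) as ensemble.VotingRegressor, which fits each estimator on the full training set and averages their predictions (optionally weighted). A minimal usage sketch:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression, Ridge

X, y = make_regression(n_samples=60, n_features=3, random_state=2)

# Each named estimator is fit on the whole of X, y; predict() returns
# the (optionally weighted) average of the individual predictions.
vr = VotingRegressor(estimators=[('lr', LinearRegression()),
                                 ('ridge', Ridge(alpha=1.0))])
vr.fit(X, y)
print(vr.predict(X).shape)  # (60,)
```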
