Skip to content

[MRG] ENH: multi-output support for BaggingRegressor #8547

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

dohmatob
Copy link
Contributor

@dohmatob dohmatob commented Mar 6, 2017

Closes #8993, #3449, #4848. This minimalistic patch adds multi-output support to BaggingRegressor.

@dohmatob dohmatob force-pushed the general-bagging branch 2 times, most recently from f8f4c27 to a178084 Compare March 7, 2017 07:28
Copy link
Member

@raghavrv raghavrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice... Also could you add a whatsnew...

boston.target,
random_state=rng)

reg1 = BaggingRegressor(base_estimator=Ridge(), n_estimators=1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you test for n_estimators>1?

Copy link
Contributor Author

@dohmatob dohmatob Jun 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya you're absolutely right. n_estimators=1 makes no sense within a test called "bagging" :). Fixed.

@@ -281,7 +283,7 @@ def _fit(self, X, y, max_samples=None, max_depth=None, sample_weight=None):
random_state = check_random_state(self.random_state)

# Convert data
X, y = check_X_y(X, y, ['csr', 'csc'])
X, y = check_X_y(X, y, ['csr', 'csc'], multi_output=self.multi_output)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why we can't just use multi_output=True

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If multioutput is false, we don't want y to be having more than one columns no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, @johannah I don't see why you you'd rather not have the multi_output=True in that func call.

Copy link
Contributor Author

@dohmatob dohmatob Jun 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After second thoughts, i think @jnothman has a point. We can make BaggingRegressor support multi-output targets by simply

  • Forcing multi_output=True in the call check_X_y(...)
  • Skipping the _validate_y(...) thing. I guess the question should we really be skipping this ?

@raghavrv raghavrv changed the title ENH: multi-output support for BaggingRegressor [MRG] ENH: multi-output support for BaggingRegressor Mar 7, 2017
@jnothman
Copy link
Member

jnothman commented Mar 7, 2017 via email

@dohmatob dohmatob force-pushed the general-bagging branch 2 times, most recently from 82973cc to 13ef011 Compare March 24, 2017 14:36
@dohmatob dohmatob force-pushed the general-bagging branch 3 times, most recently from 86e1d16 to 2bbf3ab Compare June 6, 2017 10:25
@jnothman
Copy link
Member

jnothman commented Jun 6, 2017

I don't think we need a classifier parameter to make this work, do we?

@dohmatob
Copy link
Contributor Author

dohmatob commented Jun 8, 2017

@johannah: I don't get your question, please could you be more precise.

@dohmatob dohmatob force-pushed the general-bagging branch 6 times, most recently from 8e68e4e to 168eaf3 Compare June 8, 2017 18:27
@jnothman
Copy link
Member

jnothman commented Jun 8, 2017 via email

@dohmatob dohmatob force-pushed the general-bagging branch 2 times, most recently from e3acfa4 to 7b9b39f Compare June 10, 2017 11:39
@@ -843,6 +843,9 @@ Trees and ensembles
:class:`ensemble.VotingClassifier` to fit underlying estimators in parallel.
:issue:`5805` by :user:`Ibraim Ganiev <olologin>`.

- :class:`ensemble.BaggingRegressor` now supports multi-output targets.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to move to the v0.19 section.

@@ -390,8 +390,9 @@ def _set_oob_score(self, X, y):
"""Calculate out of bag predictions and score."""

def _validate_y(self, y):
# Default implementation
return column_or_1d(y, warn=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there should still be a check_array call here. And if not, we should delete this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this might be right, since the validation is done by the downstream estimator.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've had several contributions of multi-output support for bagging. I think keeping it minimal here (not trying to do classification at the same time) is a good idea.

@@ -390,8 +390,9 @@ def _set_oob_score(self, X, y):
"""Calculate out of bag predictions and score."""

def _validate_y(self, y):
# Default implementation
return column_or_1d(y, warn=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this might be right, since the validation is done by the downstream estimator.

reg1.fit(X_train, y_train)
for n_targets in [1, 2]:
y_train_ = np.ndarray((len(y_train), n_targets))
y_train_.T[:] = y_train.copy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm finding this hard to understand.

@amueller
Copy link
Member

What's the status on this?

@stefanmeili
Copy link

I'd also really like to see this implemented.

@jlnzhou
Copy link

jlnzhou commented Mar 2, 2021

Hi!
This PR is stalled, can I try to work on this issue if it is suitable for a new contributor?

@adrinjalali
Copy link
Member

Closing as implemented in #9707

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BaggingRegressor and BaggingClassifier should support multi-output targets
8 participants