Skip to content

FIX Fix RandomForestRegressor doesn't accept max_samples=1.0 #20156 #20159

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Jun 4, 2021

Conversation

murata-yu
Copy link
Contributor

@murata-yu murata-yu commented May 28, 2021

Reference Issues/PRs

Fixes #20156

What does this implement/fix? Explain your changes.

fix max_samples range for (0, 1) to (0, 1]

Any other comments?

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good. Could you please add a test to check that this edge is accepted? Test that the result of:

RandomForestRegressor(max_samples=1.0, random_state=0).fit(X_train, y_train).predict_proba(X_test)

is the same:

RandomForestRegressor(max_samples=None, random_state=0).fit(X_train, y_train).predict_proba(X_test)

@jeremiedbb
Copy link
Member

Thanks @murata-yu ! Please add a what's new entry.

@murata-yu
Copy link
Contributor Author

Hi, there. I add tests on 3f8a2a6.

I have some questions to discuss.

  • FOREST_CLASSIFIERS have predict_proba, but FOREST_REGRESSORS don't have predict_proba. So I compared MSE of predict and truth. Is this OK?
  • I couldn't come up with a good name for the tests, so please let me know if you have a good idea.

@murata-yu murata-yu requested a review from ogrisel May 31, 2021 01:42
@ogrisel
Copy link
Member

ogrisel commented May 31, 2021

FOREST_CLASSIFIERS have predict_proba, but FOREST_REGRESSORS don't have predict_proba. So I compared MSE of predict and truth. Is this OK?

Yes!

I couldn't come up with a good name for the tests, so please let me know if you have a good idea.

I think the test names are fine.

Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just a small comment. Also please document the change in doc/whats_new/v1.0.rst.

@murata-yu
Copy link
Contributor Author

I documented the change in doc/whats_new/v1.0.rst on 7a15e41.

Copy link
Contributor

@cmarmo cmarmo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @murata-yu , sphinx is complaining about the syntax in the what's new file. This will fix the warning. Thanks for your work so far.

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Copy link
Member

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the above comment, LGTM. Thank you very much for the PR!

@murata-yu murata-yu requested a review from ogrisel June 1, 2021 10:21
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @murata-yu !

murata-yu and others added 8 commits June 3, 2021 12:18
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@murata-yu murata-yu requested a review from thomasjpfan June 3, 2021 04:28
Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the updates @murata-yu !

I left a small comment regarding the whats new. Otherwise this PR looks good to me.

@thomasjpfan thomasjpfan changed the title [MRG] Fix RandomForestRegressor doesn't accept max_samples=1.0 #20156 FIX Fix RandomForestRegressor doesn't accept max_samples=1.0 #20156 Jun 4, 2021
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>
@thomasjpfan thomasjpfan merged commit a1a6b3a into scikit-learn:main Jun 4, 2021
thomasjpfan added a commit to thomasjpfan/scikit-learn that referenced this pull request Jun 8, 2021
* TST enable test docstring params for feature extraction module (scikit-learn#20188)

* DOC fix a reference in sklearn.ensemble.GradientBoostingRegressor (scikit-learn#20198)

* FIX mcc zero divsion  (scikit-learn#19977)

* TST Add TransformedTargetRegressor to test_meta_estimators_delegate_data_validation (scikit-learn#20175)

Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>

* TST enable n_feature_in_ test for feature_extraction module

* FIX Uses points instead of pixels in plot_tree (scikit-learn#20023)

* MNT n_features_in through the multiclass module (scikit-learn#20193)

* CI Removes python 3.6 builds from wheel building (scikit-learn#20184)

* FIX Fix typo in error message in `fetch_openml` (scikit-learn#20201)

* FIX Fix error when using Calibrated with Voting (scikit-learn#20087)

* FIX Fix RandomForestRegressor doesn't accept max_samples=1.0 (scikit-learn#20159)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

* ENH Adds Poisson criterion in RandomForestRegressor (scikit-learn#19836)

Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Alihan Zihna <alihanz@gmail.com>
Co-authored-by: Alihan Zihna <a.zihna@ckhgbdp.onmicrosoft.com>
Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>
Co-authored-by: naozin555 <37050583+naozin555@users.noreply.github.com>
Co-authored-by: Venkatachalam N <venky.yuvy@gmail.com>
Co-authored-by: Thomas J. Fan <thomasjpfan@gmail.com>

* TST Replace assert_warns from decomposition/tests (scikit-learn#20214)

* TST check n_features_in_ in pipeline module (scikit-learn#20192)

Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>

* Allow `n_knots=None` if knots are explicitly specified in `SplineTransformer` (scikit-learn#20191)


Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>

* FIX make check_complex_data deterministic (scikit-learn#20221)

* TST test_fit_docstring_attributes include properties (scikit-learn#20190)

* FIX Uses the color max for colormap in ConfusionMatrixDisplay (scikit-learn#19784)

* STY Changing .format method to f-string formatting (scikit-learn#20215)

* CI Adds permissions for label action

Co-authored-by: Jérémie du Boisberranger <34657725+jeremiedbb@users.noreply.github.com>
Co-authored-by: tsuga <2888173+tsuga@users.noreply.github.com>
Co-authored-by: Conner Shen <connershen98@hotmail.com>
Co-authored-by: Guillaume Lemaitre <g.lemaitre58@gmail.com>
Co-authored-by: mlondschien <61679398+mlondschien@users.noreply.github.com>
Co-authored-by: Clément Fauchereau <clement.fauchereau@ensta-bretagne.org>
Co-authored-by: murata-yu <67666318+murata-yu@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@ensta.org>
Co-authored-by: Brian Sun <52805678+bsun94@users.noreply.github.com>
Co-authored-by: Christian Lorentzen <lorentzen.ch@gmail.com>
Co-authored-by: Alihan Zihna <alihanz@gmail.com>
Co-authored-by: Alihan Zihna <a.zihna@ckhgbdp.onmicrosoft.com>
Co-authored-by: Chiara Marmo <cmarmo@users.noreply.github.com>
Co-authored-by: Olivier Grisel <olivier.grisel@gmail.com>
Co-authored-by: naozin555 <37050583+naozin555@users.noreply.github.com>
Co-authored-by: Venkatachalam N <venky.yuvy@gmail.com>
Co-authored-by: Nanshan Li <nanshanli@dsaid.gov.sg>
Co-authored-by: solosilence <abhishekkr23rs@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RandomForestRegressor doesn't accept max_samples=1.0
5 participants