BUG Fix zero division error in GBDTs #14024
Conversation
@@ -2352,8 +2352,8 @@ def check_decision_proba_consistency(name, estimator_orig):
             hasattr(estimator, "predict_proba")):

         estimator.fit(X, y)
-        a = estimator.predict_proba(X_test)[:, 1]
-        b = estimator.decision_function(X_test)
+        a = estimator.predict_proba(X_test)[:, 1].round(decimals=10)
+        b = estimator.decision_function(X_test).round(decimals=10)
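For context, a minimal sketch of what the surrounding check does. The rank comparison via scipy.stats.rankdata is assumed from sklearn.utils.estimator_checks; the real implementation may differ in its details:

```python
import numpy as np
from scipy.stats import rankdata

def check_decision_proba_consistency_sketch(estimator, X, y, X_test):
    # predict_proba and decision_function must rank the test samples
    # identically; rounding to 10 decimals discards floating point
    # noise before the ranks are compared.
    estimator.fit(X, y)
    a = estimator.predict_proba(X_test)[:, 1].round(decimals=10)
    b = estimator.decision_function(X_test).round(decimals=10)
    np.testing.assert_array_equal(rankdata(a), rankdata(b))
```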
That's not great :-/ Can we avoid that? Basically you're saying after the "fix" the estimator is not consistent any more because of floating point issues?
Because scipy's expit isn't precise enough. But I feel like 1e-10 is a decent precision. I had issues with this test before and IMO it's too strict.
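A small illustration of the precision issue being described (the specific values, 15.0 and a 1e-12 gap, are chosen for this sketch and are not taken from the PR): two distinct decision values whose difference expit cannot resolve in float64 produce tied probabilities, so the unrounded arrays rank differently.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import rankdata

# Two distinct decision values whose gap is compressed below float64
# resolution by the sigmoid: the probabilities tie, the decision
# values do not.
b = np.array([15.0, 15.0 + 1e-12, 0.0])
a = expit(b)

print(a[0] == a[1])                              # True: tied probabilities
print(np.array_equal(rankdata(a), rankdata(b)))  # False: ranks disagree

# Rounding both arrays to 10 decimals also ties the near-equal
# decision values, so the rank comparison passes again.
print(np.array_equal(rankdata(a.round(10)), rankdata(b.round(10))))  # True
```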
Agreed it's fine.
Should we be rounding as a non-copying operation? Is this copying?
It's copying.
I can't tell from the docs (nor the code) whether it copies if we pass out=.
I don't think that's important though?
Using out= will not copy if it's into the same array.
I forgot this was in estimator checks. Yes, not important.
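For reference, a quick numpy demonstration of the copy semantics discussed above:

```python
import numpy as np

a = np.linspace(0, 1, 5)

# ndarray.round returns a new array by default: the original buffer
# is left untouched.
b = a.round(decimals=10)
print(np.shares_memory(a, b))  # False: this is a copy

# Passing out= writes the result into an existing array; using the
# input itself rounds in place with no extra allocation.
c = a.round(decimals=10, out=a)
print(np.shares_memory(a, c))  # True: no copy was made
```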
LGTM
sklearn/ensemble/_hist_gradient_boosting/tests/test_gradient_boosting.py
LGTM once #14024 (comment) is addressed.
Comment was addressed.
Fixes #14018. Probably also fixes #14014.
The non-regression test fails on master.