FIX validate properly zero_division=np.nan when used in parallel processing #27573
Conversation
ping @jeremiedbb to be sure that I don't make any mistake in the testing.
LGTM.
I thought that we could tweak `Options` to correctly handle NaN detection, but it does not make things simpler. This solution is simple and not confusing, so let's go for it.
I think that the failure was not linked (negative
closes #27563

For the classification metrics, we make a constraint check with `constraints = Options(Real, {0.0, 1.0, np.nan})`. The issue is that we check whether a value is in the set with `np.nan in constraints`. In a single process, `np.nan` should be the same singleton, so we don't have any issue. However, in parallel processing, `np.nan` is apparently not the same singleton, and the `np.nan` will not be `np.nan`. This is indeed the case when running one of these score functions (via `make_scorer`) within a cross-validation loop.

This PR intends to make public the `_NanConstraint` via the string `"nan"` such that we make the right check and do not use the `is` statement.
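The identity pitfall behind this bug can be reproduced with plain Python floats. The sketch below uses `float("nan")` in place of `np.nan` (each call creates a distinct object, just as unpickling in a worker process does); the `satisfies` helper is a hypothetical illustration of the idea behind a dedicated NaN constraint, not scikit-learn's actual implementation:

```python
import math

# Stand-in for np.nan: each float("nan") call creates a distinct object.
NAN_SINGLETON = float("nan")

# The constraint set, mirroring Options(Real, {0.0, 1.0, np.nan}).
allowed = {0.0, 1.0, NAN_SINGLETON}

# Set membership checks identity first, so the same NaN object is found,
assert NAN_SINGLETON in allowed
# but a distinct NaN object (e.g. one recreated in a worker process) is
# not found: identity differs and NaN != NaN per IEEE 754.
worker_nan = float("nan")
assert worker_nan not in allowed

# A dedicated NaN check (the idea behind a "nan" constraint) is robust:
def satisfies(value):
    if isinstance(value, float) and math.isnan(value):
        return True  # accept any NaN object, regardless of identity
    return value in allowed

assert satisfies(worker_nan)
assert satisfies(0.0)
assert not satisfies(2.0)
```

This is why an `is`-based (or set-membership) check silently works single-process but breaks once the value crosses a process boundary.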