Skip to content

FIX validate properly zero_division=np.nan when used in parallel processing #27573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 13, 2023

Conversation

glemaitre
Copy link
Member

closes #27563

For the classification metrics, we make a constraint check with constraints = Options(Real, {0.0, 1.0, np.nan}). The issue is that we will check if a value is in the set with np.nan is constraints. In a single process, np.nan should be the same singleton so we don't have any issue. However, in parallel process, np.nan is apparently no the same singleton and the np.nan will not be np.nan. This is indeed the case when running on of these score function (via make_scorer) within a cross-validation loop.

This PR intends to make public the _NanConstraint via the string "nan" such that we make the right check and not the use the is statement.

@github-actions
Copy link

github-actions bot commented Oct 11, 2023

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c76787b. Link to the linter CI: here

@glemaitre
Copy link
Member Author

ping @jeremiedbb to be sure that I don't make any mistake in the testing.

Copy link
Member

@jeremiedbb jeremiedbb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I thought that we could tweak Options to correctly handle nan detection but it does not make things simpler. This solution is simple and not confusing so let's go for it.

@glemaitre
Copy link
Member Author

glemaitre commented Oct 12, 2023

I thing that the failure was not linked (negative score_time), I will relaunch this specific build.

@betatim betatim merged commit 5444030 into scikit-learn:main Oct 13, 2023
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 17, 2023
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2023
REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants