FIX validate properly zero_division=np.nan when used in parallel processing #27573

glemaitre · 2023-10-11T21:07:27Z

For the classification metrics, we make a constraint check with `constraints = Options(Real, {0.0, 1.0, np.nan})`. The issue is that we check whether a value is in the set with `np.nan in constraints`. In a single process, np.nan should be the same singleton, so we don't have any issue. However, with parallel processing, np.nan is apparently not the same singleton, so the np.nan passed in will not be identical to the np.nan stored in the set. This is indeed the case when running one of these score functions (via make_scorer) within a cross-validation loop.

This PR intends to make the _NanConstraint public via the string "nan" so that we make the right check and do not rely on the `is` statement.
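The identity problem described above can be reproduced without scikit-learn. The following is a minimal sketch, not the code from this PR: it assumes that a pickle round trip is a fair stand-in for sending a value to a worker process, it uses a plain float NaN in place of np.nan (which is itself a Python float), and the helper names `in_options_by_membership` and `in_options_nan_aware` are hypothetical.

```python
import math
import pickle

# Stand-in for np.nan: a plain Python float NaN behaves the same way here.
NAN = float("nan")
ALLOWED = {0.0, 1.0, NAN}


def in_options_by_membership(value):
    # Set membership falls back to identity for NaN, because NaN != NaN.
    return value in ALLOWED


def in_options_nan_aware(value):
    # Hypothetical NaN-aware check: test for NaN by value, not by identity.
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in ALLOWED


# Assumption: pickling mimics what happens when the value crosses a process
# boundary. The result is a new float object that is still NaN.
round_tripped = pickle.loads(pickle.dumps(NAN))

print(in_options_by_membership(NAN))            # same object
print(in_options_by_membership(round_tripped))  # different object
print(in_options_nan_aware(round_tripped))      # value-based check
```

The membership test only succeeds for the exact object stored in the set, whereas the value-based check accepts any NaN, which is the behavior a dedicated NaN constraint is meant to provide.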

github-actions · 2023-10-11T21:08:30Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: c76787b. Link to the linter CI: here

glemaitre · 2023-10-11T21:08:39Z

Ping @jeremiedbb to be sure that I didn't make any mistake in the testing.

sklearn/metrics/tests/test_classification.py

jeremiedbb

LGTM.

I thought that we could tweak Options to correctly handle nan detection, but it does not make things simpler. This solution is simple and not confusing, so let's go for it.

glemaitre · 2023-10-12T12:10:38Z

I think that the failure was unrelated (negative score_time); I will relaunch this specific build.

FIX validate properly zero_division=np.nan when used in parallel processing

8bee2b7

github-actions bot added module:metrics module:utils labels Oct 11, 2023

change pr number

1aeef4a

betatim reviewed Oct 12, 2023

sklearn/metrics/tests/test_classification.py

make sure to raise

c76787b

jeremiedbb approved these changes Oct 12, 2023

betatim merged commit 5444030 into scikit-learn:main Oct 13, 2023

glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 17, 2023

FIX validate properly zero_division=np.nan when used in parallel processing (scikit-learn#27573)

c39c2bb

glemaitre added a commit that referenced this pull request Oct 23, 2023

FIX validate properly zero_division=np.nan when used in parallel processing (#27573)

d53756e

glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2023

FIX validate properly zero_division=np.nan when used in parallel processing (scikit-learn#27573)

002719b

REDVM pushed a commit to REDVM/scikit-learn that referenced this pull request Nov 16, 2023

FIX validate properly zero_division=np.nan when used in parallel processing (scikit-learn#27573)

fda94f9
