Skip to content

FIX handle VisibleDeprecationWarning in type_of_target and is_multilabel #18423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 29, 2020

Conversation

OlehKSS
Copy link
Contributor

@OlehKSS OlehKSS commented Sep 17, 2020

Reference Issues/PRs

Fix #18367

What does this implement/fix? Explain your changes.

Fix numpy.VisibleDeprecationWarning in type_of_target, is_multilabel.
This warning was issued by numpy.asarray because of ragged/jagged sequences that were provided to this function. It was spotted in the following unit tests:

  • test_raise_value_error_multilabel_sequences:
  • test__check_targets
  • test_unique_labels_non_specific
  • test_is_multilabel
  • test_check_classification_targets
  • test_type_of_target

Any other comments?

As far as I see, there are several ways to solve it:

  1. Not provide ragged/jagged sequences failing unit tests, use numpy.array instead. In that case, only users will see this warning
  2. Treat any input sequence as ragged/jagged, i.e. always call numpy.asarray(, dtype=object)
  3. Test where an input sequence is ragged/jagged and then call numpy.asarray correctly.

I decided to it as in the 3rd list item. Here, I assumed that it'd better let NumPy do this test and then catch that warning as an exception. Not sure whether it is the optimal way to do it. What do you think?

Copy link
Member

@alfaro96 alfaro96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @OlehKSS,

Thank you for your PR.

I think we should comment why y = np.asarray(y, dtype=object) is required. Maybe something like:

# Specify dtype=object type for ragged arrays because
# NEP 34 disallow inferring these arrays from sequences

Moreover, we should include a TODO note to fix this issue when changing the behaviour to raise a ValueError instead.

@OlehKSS OlehKSS force-pushed the issue-18367-multiclass branch from 99ae8b8 to 4301144 Compare September 26, 2020 16:18
@OlehKSS
Copy link
Contributor Author

OlehKSS commented Sep 26, 2020

Hi, @alfaro96
Thanks for the review, I've added comments as you suggested.

@cmarmo cmarmo added this to the 0.24 milestone Oct 20, 2020
@cmarmo
Copy link
Contributor

cmarmo commented Oct 20, 2020

Same as #18414, milestoned as 0.24 in order to update tests with respect to newer versions of numpy.

@@ -137,8 +137,20 @@ def is_multilabel(y):
>>> is_multilabel(np.array([[1, 0, 0]]))
True
"""
import warnings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should import warnings on the top of the file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

if hasattr(y, '__array__') or isinstance(y, Sequence):
y = np.asarray(y)
# TODO: Replace the warning context manager with a try-except statement
# DeprecationWarning will be replaced by ValueError, see NEP 34
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# DeprecationWarning will be replaced by ValueError, see NEP 34
# DeprecationWarning will be replaced by ValueError, see NEP 34
# https://numpy.org/neps/nep-0034-infer-dtype-is-object.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@@ -236,6 +248,8 @@ def type_of_target(y):
>>> type_of_target(np.array([[0, 1], [1, 1]]))
'multilabel-indicator'
"""
import warnings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@@ -250,7 +264,16 @@ def type_of_target(y):
if is_multilabel(y):
return 'multilabel-indicator'

y = np.asarray(y)
# TODO: Replace the warning context manager with a try-except statement
# DeprecationWarning will be replaced by ValueError, see NEP 34
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# DeprecationWarning will be replaced by ValueError, see NEP 34
# DeprecationWarning will be replaced by ValueError, see NEP 34
# https://numpy.org/neps/nep-0034-infer-dtype-is-object.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@glemaitre glemaitre changed the title Fix VisibleDeprecationWarning in type_of_target, is_multilabel FIX handle VisibleDeprecationWarning in type_of_target and is_multilabel Oct 28, 2020
Copy link
Member

@glemaitre glemaitre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM

@OlehKSS
Copy link
Contributor Author

OlehKSS commented Oct 29, 2020

@glemaitre I have updated this pull request according to your suggestions

Copy link
Member

@thomasjpfan thomasjpfan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @OlehKSS !

LGTM

@thomasjpfan thomasjpfan merged commit 4c547b6 into scikit-learn:master Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BLD numpy VisibleDeprecationWarning in test suite
5 participants