FIX utils.multiclass.type_of_target with numpy 1.24 dev #24044

Merged: 4 commits merged into scikit-learn:main on Aug 4, 2022

lesteve
Member

@lesteve lesteve commented Jul 29, 2022

Seen in #23626.

In numpy 1.24dev, np.array([[1], [1, 2]]) raises a ValueError; you need to specify dtype=object explicitly.

See https://numpy.org/neps/nep-0034-infer-dtype-is-object.html for more details.
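For illustration, here is a minimal sketch of the explicit form that NEP 34 asks for. It only shows the numpy behaviour with the ragged input from the description; it is not the code changed in this PR.

```python
import numpy as np

# Ragged (jagged) nested list: the inner sequences have different lengths.
ragged = [[1], [1, 2]]

# Passing dtype=object explicitly builds a 1-D array whose elements are the
# inner Python lists, instead of relying on dtype inference.
arr = np.array(ragged, dtype=object)

print(arr.shape, arr.dtype)  # (2,) object
```

Without dtype=object, the same call depends on the numpy version: older releases warn or silently create an object array, while 1.24dev raises.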

I think this was an oversight in #18423.

Not super familiar with the sklearn.utils.multiclass.type_of_target/is_multilabel details, but I am wondering whether we could simplify the code and use y = np.asarray(y, dtype=object). Maybe we rely on the inferred dtype in the non-ragged array-like case. Edit: looks like we do, since pytest sklearn/utils/tests/test_multiclass.py fails when trying my naive simplification.
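A small sketch of why always forcing dtype=object is lossy for regular inputs. It assumes, as the failing tests suggest, that downstream checks look at the inferred dtype; the variable names are illustrative only.

```python
import numpy as np

# A regular (non-ragged) float target.
y = [0.5, 1.5, 2.5]

inferred = np.asarray(y)               # numpy infers a float dtype
forced = np.asarray(y, dtype=object)   # the naive simplification

print(inferred.dtype.kind)  # f
print(forced.dtype.kind)    # O
```

Any logic that branches on the dtype kind (for example, to tell float targets from integer ones) can no longer do so once everything is an object array.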

@lesteve lesteve changed the title FIX utils.multiclass.type_of_target after numpy NEP 34 implementation FIX utils.multiclass.type_of_target in numpy 1.24dev Jul 29, 2022
@lesteve lesteve changed the title FIX utils.multiclass.type_of_target in numpy 1.24dev FIX utils.multiclass.type_of_target with numpy 1.24 dev Jul 29, 2022
@lesteve
Member Author

lesteve commented Jul 29, 2022

The scipy-dev build still fails because there are other errors left to fix there, but errors like the one below no longer appear, which shows that the fix works for numpy 1.24dev:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
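One way to handle this error while keeping dtype inference for regular inputs is to try the plain conversion first and fall back to dtype=object only when numpy rejects the input. The sketch below illustrates that pattern under stated assumptions: the helper name to_array_allowing_ragged is hypothetical, and this is not the exact diff merged in this PR.

```python
import warnings

import numpy as np

# VisibleDeprecationWarning lives in numpy.exceptions in newer numpy releases
# and in the top-level namespace in older ones.
try:
    from numpy.exceptions import VisibleDeprecationWarning
except ImportError:
    VisibleDeprecationWarning = np.VisibleDeprecationWarning


def to_array_allowing_ragged(y):
    """Hypothetical helper: convert y, using dtype=object only for ragged input."""
    with warnings.catch_warnings():
        # Older numpy warns on ragged input; turn that warning into an error
        # so both old and new versions take the same fallback path.
        warnings.simplefilter("error", VisibleDeprecationWarning)
        try:
            return np.asarray(y)
        except (VisibleDeprecationWarning, ValueError):
            return np.asarray(y, dtype=object)


ragged_arr = to_array_allowing_ragged([[1], [1, 2]])
regular_arr = to_array_allowing_ragged([1, 2, 3])
```

With this approach the regular input keeps its inferred integer dtype, and only the ragged input ends up as an object array.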

Member

@thomasjpfan thomasjpfan left a comment

I do not think we actually support jagged arrays as a target. After the jagged array is created, I think we always raise a more detailed error. Some of this behavior is tested in test_raise_value_error_multilabel_sequences.

As a quick fix, I am okay with this PR.

@thomasjpfan thomasjpfan modified the milestone: 1.1.2 Aug 2, 2022
@glemaitre glemaitre merged commit cd14723 into scikit-learn:main Aug 4, 2022
@glemaitre
Member

Thanks @lesteve

glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Aug 4, 2022
glemaitre pushed a commit that referenced this pull request Aug 5, 2022
Co-authored-by: Julien Jerphanion <git@jjerphan.xyz>
@lesteve lesteve deleted the fix-type-of-target branch August 8, 2022 06:51
glemaitre pushed a commit to glemaitre/scikit-learn that referenced this pull request Sep 12, 2022

4 participants