Describe the workflow you want to enable
I would like to be able to pass the nullable pandas dtypes ("Int64", "Float64", "boolean") into sklearn's unique_labels function. Because these dtypes become object dtype when converted to NumPy arrays, we get ValueError: Mix type of y not allowed, got types {'binary', 'unknown'}.
Repro with sklearn 1.2.1:
import pandas as pd
import pytest
from sklearn.utils.multiclass import unique_labels

for dtype in ["Int64", "Float64", "boolean"]:
    y_true = pd.Series([1, 0, 0, 1, 0, 1, 1, 0, 1], dtype=dtype)
    y_predicted = pd.Series([0, 0, 1, 1, 0, 1, 1, 1, 1], dtype="int64")

    with pytest.raises(ValueError, match="Mix type of y not allowed, got types"):
        unique_labels(y_true, y_predicted)
Describe your proposed solution
We should get the same behavior as when int64, float64, and bool dtypes are used, which is no error:
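A minimal sketch of that expected behavior, assuming the same data as the repro with the non-nullable dtypes swapped in. The `results` dictionary is an illustrative addition and not part of the original report.

```python
import pandas as pd
from sklearn.utils.multiclass import unique_labels

# Collect the labels found for each dtype; `results` is an illustrative name.
results = {}
for dtype in ["int64", "float64", "bool"]:
    y_true = pd.Series([1, 0, 0, 1, 0, 1, 1, 0, 1], dtype=dtype)
    y_predicted = pd.Series([0, 0, 1, 1, 0, 1, 1, 1, 1], dtype="int64")

    # No exception is expected here, unlike with the nullable dtypes.
    results[dtype] = unique_labels(y_true, y_predicted)
```

With the nullable dtypes, the request is for this same loop to run without raising.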
scikit-learn already has check_array, which handles converting pandas nullable dtypes into their corresponding NumPy dtypes, so I opened #25638 to use check_array to fix this issue.
Describe alternatives you've considered, if relevant
Our current workaround is to convert the data to NumPy arrays with the corresponding non-nullable dtype (int64, float64, or bool) before passing it into unique_labels.
Additional context
No response
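For reference, the workaround mentioned under alternatives can be sketched roughly as follows. This assumes the data contains no missing values, since pd.NA cannot be represented in a plain integer or boolean NumPy array. The `to_plain_numpy` helper and the `NULLABLE_TO_NUMPY` mapping are hypothetical names used only for illustration.

```python
import pandas as pd
from sklearn.utils.multiclass import unique_labels

# Hypothetical mapping from nullable pandas dtypes to plain NumPy dtypes.
NULLABLE_TO_NUMPY = {"Int64": "int64", "Float64": "float64", "boolean": "bool"}


def to_plain_numpy(y):
    """Convert a nullable pandas Series to a NumPy array of the matching dtype.

    Assumes `y` has no missing values; other dtypes are converted as-is.
    """
    target = NULLABLE_TO_NUMPY.get(str(y.dtype))
    return y.to_numpy(dtype=target) if target else y.to_numpy()


y_true = pd.Series([1, 0, 0, 1, 0, 1, 1, 0, 1], dtype="Int64")
y_predicted = pd.Series([0, 0, 1, 1, 0, 1, 1, 1, 1], dtype="int64")

# Converting first avoids the object-dtype array that triggers the error.
labels = unique_labels(to_plain_numpy(y_true), y_predicted)
```

The conversion has to be repeated at every call site, which is why handling the nullable dtypes inside unique_labels would be preferable.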