Describe the bug
This is an unexpected (and I would argue undesirable) behavior change introduced in 1.2.0 by #25080. The issue is that check_array applied to a pandas series of dtype bool upcasts the returned series to dtype float64. I would guess that there is related upcasting behavior for other numeric dtypes. This is a change from version 1.1.3 with the potential to cause unexpected downstream failures (I found it because I tried to use the invert operator ~ on the series returned by check_array, which works for bool but not float64).
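To illustrate the downstream failure independently of scikit-learn, here is a minimal sketch using plain NumPy arrays. The float64 array is only a stand-in for the upcast result, and the variable names are mine, not anything from the library.

```python
import numpy as np

bool_values = np.array([False, True])
# Stand-in for the upcast result: same values, but dtype float64.
float_values = bool_values.astype(np.float64)

# On a bool array, ~ is an elementwise logical NOT.
inverted = ~bool_values

try:
    ~float_values
    invert_failed = False
except TypeError:
    # NumPy's invert ufunc is not defined for floating-point inputs.
    invert_failed = True

print(inverted.dtype, invert_failed)
```

So any code that relied on getting a bool array back and then negated it would break once the dtype changes.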
Steps/Code to Reproduce
from sklearn.utils import check_array
import pandas as pd
ser = pd.Series([False, True])
Expected Results
I would expect the dtype to be preserved (it is preserved in 1.1.3):
> print(check_array(ser, ensure_2d=False, force_all_finite=False, dtype=None).dtype)
bool
Actual Results
The series is upcast from bool to float64:
> print(check_array(ser, ensure_2d=False, force_all_finite=False, dtype=None).dtype)
float64
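As a possible stopgap in downstream code, one could cast the result back when the input was bool. This is only a sketch under my own assumptions: restore_bool_dtype is a hypothetical helper, not part of scikit-learn, and it assumes the original dtype is a plain NumPy dtype (not a pandas extension dtype).

```python
import numpy as np


def restore_bool_dtype(original_dtype, result):
    """Hypothetical helper: cast `result` back to bool if the input was bool.

    Assumes `original_dtype` is something np.dtype() accepts, e.g. ser.dtype
    for a pandas series backed by a NumPy bool array.
    """
    if np.dtype(original_dtype) == np.bool_ and result.dtype != np.bool_:
        return result.astype(bool)
    return result


# Stand-in for what check_array returns in 1.2.0 for a bool series.
upcast = np.array([0.0, 1.0])
restored = restore_bool_dtype(bool, upcast)
print(restored.dtype)
```

In real use the second argument would be the output of check_array and the first would be ser.dtype captured before the call.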
Versions
System:
    python: 3.9.14 (main, Oct 14 2022, 16:22:46) [Clang 14.0.0 (clang-1400.0.29.102)]
    executable: /Users/ben.fogelson/.pyenv/versions/sklearn-bug/bin/python
    machine: macOS-12.6.1-x86_64-i386-64bit

Python dependencies:
    sklearn: 1.2.0
    pip: 22.3.1
    setuptools: 65.6.3
    numpy: 1.23.5
    scipy: 1.9.3
    Cython: None
    pandas: 1.5.2
    matplotlib: None
    joblib: 1.2.0
    threadpoolctl: 3.1.0

Built with OpenMP: True

threadpoolctl info:
    user_api: openmp
    internal_api: openmp
    prefix: libomp
    filepath: /Users/ben.fogelson/.pyenv/versions/3.9.14/envs/sklearn-bug/lib/python3.9/site-packages/sklearn/.dylibs/libomp.dylib
    version: None
    num_threads: 12

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /Users/ben.fogelson/.pyenv/versions/3.9.14/envs/sklearn-bug/lib/python3.9/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
    version: 0.3.20
    threading_layer: pthreads
    architecture: Haswell
    num_threads: 6

    user_api: blas
    internal_api: openblas
    prefix: libopenblas
    filepath: /Users/ben.fogelson/.pyenv/versions/3.9.14/envs/sklearn-bug/lib/python3.9/site-packages/scipy/.dylibs/libopenblas.0.dylib
    version: 0.3.18
    threading_layer: pthreads
    architecture: Haswell
    num_threads: 6